DOCUMENT RESUME

ED 135 636                                            TM 006 063

TITLE          The Annual Conference on Large-Scale Assessment:
               Formal Papers and Selected Bibliography (Sixth,
               Boulder, Colorado, June 14-17, 1976).
INSTITUTION    Education Commission of the States, Denver, Colo.
               National Assessment of Educational Progress.
SPONS AGENCY   National Center for Education Statistics (DHEW),
               Washington, D.C.
PUB DATE       [Jun 76]
NOTE           139p.

EDRS PRICE     MF-$0.83 HC-$7.35 Plus Postage.
DESCRIPTORS    Academic Achievement; Agency Role; College Entrance
               Examinations; *Conference Reports; *Educational
               Assessment; Elementary Secondary Education; Followup
               Studies; Hypothesis Testing; Information Utilization;
               Item Sampling; Kindergarten; Mathematics; Measurement
               Techniques; Needs Assessment; Performance Tests;
               Questionnaires; School Districts; Skill Development;
               Standards; State Agencies; State Departments of
               Education; *State Programs; Testing Problems; Testing
               Programs
IDENTIFIERS    AAHPER Cooperative Health Education Test; ACT
               Assessment Program; Delaware Educational Assessment
               Program; Iowa Assessment Program; Michigan
               Educational Assessment Program; *National Assessment
               of Educational Progress; Nebraska Assessment Battery
               Essential Learn Skills; Pennsylvania Educational
               Quality Assessment


ABSTRACT

For the past six years the National Assessment of
Educational Progress has sponsored a national Conference on
Large-Scale Assessment, designed to promote and improve
communications among educational assessment personnel in state
Departments of Education and other agencies. This volume contains
most of the papers that were accepted for presentation at the
half-day formal paper session. The 11 papers included here are: (1)
"The State Agency as a Resource in Local Needs Assessment" by Paula
T. Brictson; (2) "Establishing Criterion Levels for Judging the
Acceptability of Assessment Results" by Iris Weiss and Larry Conaway;
(3) "N-ABELS - A Manageable Technique for Monitoring the Acquisition
of Essential Learning Skills" by Harriet A. Egertson and Hugh A.
Harlan; (4) "A Process for Developing, Implementing and Following
Through on an Assessment Program in Fifth- and Eighth-Grade
Mathematics" by Max Morrison; (5) "Educational Quality Assessment
Follow-Up Survey of the 1974 Assessment" by Joyce S. Kim; (6)
"Hypothesis-Testing in Large-Scale Assessment" by Frank W. Rivas; (7)
"A Plan for Utilization of Assessment Data by Local Education
Agencies" by John A. Jones and Charles D. Oviatt; (8) "ACT Test Data
and Program Assessment for Large School Districts" by Robert Cramer;
(9) "An Example of the Use of Multiple Matrix Sampling Procedures in a
Local District Assessment Program" by Carl D. Novak; (10)
"Measurement Problems and Issues Related to Applied Performance
Testing" by James R. Sanders; and (11) "Symposium on: Large-Scale
Assessment Reporting and Usage: Delaware and Georgia as Exemplars" by
Robert Bigelow and Hervey Scudder. (Author)



Documents acquired by ERIC include many informal unpublished materials not available from other sources. ERIC makes every
effort to obtain the best copy available. Nevertheless, items of marginal reproducibility are often encountered and this affects the
quality of the microfiche and hardcopy reproductions ERIC makes available via the ERIC Document Reproduction Service (EDRS).
Reproductions supplied by EDRS are the best that can be made from the original.



The Sixth

Annual Conference on

LARGE-SCALE ASSESSMENT



Formal Papers and Selected Bibliography





NATIONAL ASSESSMENT OF EDUCATIONAL PROGRESS



The Sixth Annual Conference 
on 

LARGE-SCALE ASSESSMENT

FORMAL PAPERS AND
SELECTED BIBLIOGRAPHY 



June 14-17, 1976 
Harvest House Hotel 
Boulder, Colorado 



Sponsored by 

Department of Field Services 
National Assessment of Educational Progress 

A Project of the Education Commission of the States 
Funded by 

The National Center for Education Statistics 
Department of Health, Education and Welfare



NATIONAL ASSESSMENT OF EDUCATIONAL PROGRESS
Suite 700, 1860 Lincoln Street
Denver, Colorado 80203

Roy H. Forbes, Director



Conference Co-Directors
Frank B. Womer, University of Michigan
Irvin J. Lehmann, Michigan State University
Jack G. Schmidt, National Assessment



This report has been produced using quality recycled 20 lb. bond paper. For additional copies, write to the address above.






PREFACE 



For the past six years the National Assessment of Educational
Progress has sponsored a national Conference on Large-Scale
Assessment. Designed to promote and improve communications
among educational assessment personnel in state Departments of
Education and other agencies, the conference has experienced
a steady growth in interest, as evidenced by the number of
attendees, which rose from 17 in 1971 to nearly 200 in 1976.

With this growth have come other changes, mainly in the format
and in the substance of the programs. Earlier programs were
quite informal, permitting attendees to become familiar with
basic features of other state programs and to share with
colleagues their experiences and frustrations. More recently
the program has become more structured, with additional topics
being covered each year and with more attendees taking active
roles in formal presentations or leading discussion groups.

The 1976 program, held in Boulder, Colorado, June 14-17, added
yet another feature to this evolving conference. For the first
time, a half-day of the three-day program was devoted exclusively
to presentation of formal papers on a variety of assessment
topics. Invitations to submit papers for this portion of the
program were extended to previous attendees, various university
personnel and to others who had expressed interest in problems
of educational assessment. It was anticipated that this volunteer
paper presentation program would stimulate assessment personnel
to begin documenting some of the innovative and useful
procedures that have emerged from current assessment activities.

This volume, Formal Papers and Selected Bibliography, contains
most of the papers that were accepted for presentation at the
half-day formal paper session. Except for standardizing type
size for titles, and making minor changes in formatting, the
papers appear as they were submitted by the authors.

Papers were reviewed by a panel of readers, chaired by Jim Impara,
formerly of the Oregon assessment program and now a faculty member
at Virginia Polytechnic Institute. Readers included Dave Bayless
of the Research Triangle Institute, Don Searls of National
Assessment, Lorrie Shepard of the University of Colorado, Gordon Ascher,
formerly of the New Jersey assessment program and now with the
Oregon Department of Education, Bill Burson of the California
assessment program and John Adams, formerly of the Minnesota
assessment program and now with the Council of Chief State School
Officers. To each of them goes a sincere expression of gratitude
for their contributions in helping make the paper sessions a very
successful first effort.



Authors of submitted papers also deserve our thanks for their work,
for without papers and without their permission to publish them,
this document would not be possible.

Finally, I wish to use this opportunity to thank two persons who,
with me, served as co-directors for the overall conference program:
Frank Womer of the University of Michigan and Irv Lehmann of
Michigan State University. Their leadership in guiding this and
previous conferences has been of immeasurable value.



JACK G. SCHMIDT, Director

Department of Field Services

National Assessment of Educational Progress






TABLE OF CONTENTS 



THE STATE AGENCY AS A RESOURCE IN LOCAL NEEDS ASSESSMENT
    Paula T. Brictson

ESTABLISHING CRITERION LEVELS FOR JUDGING
THE ACCEPTABILITY OF ASSESSMENT RESULTS
    Iris Weiss and Larry Conaway

N-ABELS - A MANAGEABLE TECHNIQUE FOR MONITORING
THE ACQUISITION OF ESSENTIAL LEARNING SKILLS
    Harriet A. Egertson and Hugh A. Harlan

A PROCESS FOR DEVELOPING, IMPLEMENTING AND FOLLOWING
THROUGH ON AN ASSESSMENT PROGRAM IN FIFTH- AND
EIGHTH-GRADE MATHEMATICS
    Max Morrison

EDUCATIONAL QUALITY ASSESSMENT
FOLLOW-UP SURVEY OF THE 1974 ASSESSMENT
    Joyce S. Kim

HYPOTHESIS-TESTING IN LARGE-SCALE ASSESSMENT
    Frank W. Rivas

A PLAN FOR UTILIZATION OF ASSESSMENT DATA
BY LOCAL EDUCATION AGENCIES
    John A. Jones and Charles D. Oviatt

ACT TEST DATA AND PROGRAM ASSESSMENT
FOR LARGE SCHOOL DISTRICTS
    Robert Cramer

AN EXAMPLE OF THE USE OF MULTIPLE MATRIX SAMPLING
PROCEDURES IN A LOCAL DISTRICT ASSESSMENT PROGRAM
    Carl D. Novak

MEASUREMENT PROBLEMS AND ISSUES RELATED TO
APPLIED PERFORMANCE TESTING
    James R. Sanders

SYMPOSIUM ON: LARGE-SCALE ASSESSMENT REPORTING AND USAGE:
DELAWARE AND GEORGIA AS EXEMPLARS
    Robert Bigelow and Hervey Scudder

BIBLIOGRAPHY






THE STATE AGENCY AS A RESOURCE 
IN LOCAL NEEDS ASSESSMENT 

Paula T. Brictson 
Michigan Department of Education 



Mrs. Paula T. Brictson
Research Consultant
State Department of Education
Box 420
Lansing, Michigan 48902






THE STATE AGENCY AS A RESOURCE IN LOCAL NEEDS ASSESSMENT 

PAULA TISS BRICTSON 
MICHIGAN DEPARTMENT OF EDUCATION 

Abstract 

A study in 75 volunteer kindergarten classrooms, designed by a
state department of education, allows participating teachers the option
of using as many of the state assessment instruments as desired during
the entire kindergarten year as one of four assessment modes. Teachers
also can use commercial or teacher-made tests, observation, and other
adults as information sources about student skill attainment. Each
teacher is given a complete list of preprimary objectives, record-keeping
forms and copies of the state assessment instruments. Through
the state agency as a resource center, a teacher can tailor an assessment
program to address the needs and abilities of students within the
framework of her/his instructional program.

INTRODUCTION

Several years ago, Michigan's State Board of Education adopted a
six-step educational management system. The six steps of the system are:
the identification of common goals, the development of performance
objectives, the assessment of educational needs, the analysis of delivery
systems, the evaluation and testing of improvements in these systems,
and the development of recommendations for educational improvement.
Common goals for the state were approved in 1971, and a series of
developmental and review procedures were begun in order to develop statewide
minimal performance objectives in the basic skills areas. By 1973, the
State Board of Education had given final approval to several sets of
performance objectives, one of which was a set of performance objectives for
preprimary education. These objectives had been drafted by expert
referent groups throughout the state, and reviewed, modified and approved
by commissions composed of teachers, curriculum specialists, administrators
and lay citizens.

The Michigan Educational Assessment Program (MEAP) since January,
1970, through the testing of all 4th and 7th graders in basic reading and
mathematics skills, has provided information which can contribute to the
assessment of needs. The program has collected, analyzed, and
disseminated information on district and school resources, student background,
school and student academic performance in the basic skills, and school
and district size. These data are useful in describing certain aspects
of Michigan education and can be used by decision-makers at the school,
district, and state levels.

In the school year 1973-74, the educational assessment program made
a revolutionary change from norm-referenced to objective-referenced tests
based on the several sets of performance objectives. A major change was
also made in 1974-75, namely the introduction of a statewide first
grade pilot, which was continued with different objectives in 1975-76.

FIRST GRADE EDUCATIONAL ASSESSMENT PILOT

The performance measures administered in the program are designed
to test some of the skills of first graders in the affective, cognitive
and psychomotor domains. These skills are considered important for a
child to attain before entering first grade. The complete set of The
Tentative Objectives for Preprimary Education in Michigan, on which the
MEAP tests are based, is shown in APPENDIX A.

The procedures for the development, validation and editing of the
objective-referenced tests used in the 1974 and 1975 First Grade
Assessment programs were described in detail in two reports: "Development
and Validation of Objective-Referenced Test Instruments for Entry-Level
First Grade Children."* Briefly, educators from four Michigan school
districts (Detroit, Gwinn, Pontiac and Waterford) wrote test items.
These items were edited by American Institutes for Research and tried
out in the four school districts in two sets. Following each tryout,
the items were thoroughly reviewed and revised.

The first grade component is different from the fourth and seventh
grade components because of the unique requirements of administering
objective-referenced tests to students of this age. The tests must be
administered either individually or in small groups. This could require
enormous amounts of teacher time unless restrictions are placed on the
number of items used per objective, the number of objectives tested, and
the type of data output expected. This problem was solved by gathering
only enough data to yield reliable statewide results and by limiting the
number of objectives to be assessed in the program.

In the first year of the first grade educational assessment program
(1974-75), 44 individually- and group-administered tests were constructed
to measure 48 of the 134 preprimary objectives. Each test was taken by a
statewide sample of first graders. Approximately 2,500 teachers and
77,000 students were involved in the program. In 1975-76, an additional
32 small group- and individually-administered objective-referenced tests
were administered to a statewide sample of 65,000 students. No student
selected for the sample was tested with more than one test form.

* American Institutes for Research, Palo Alto, California, August, 1974,
and June, 1975.

Teacher feedback elicited through a questionnaire enclosed in the
1974-75 test package indicated that the majority of the behaviors
described in the preprimary objectives had already been acquired by entering
first graders. The statewide results for the first grade educational
assessment program confirmed this teacher observation in that 75% or more
of the students correctly answered every test item for 29 of the 48
objectives. In addition, teachers' comments about the usefulness of the
information and needed improvements were requested. Some comments were:
1) the information was useful and assessment of preprimary objectives
should continue; 2) some teachers suggested that since the majority of
entering first grade students had acquired the described behaviors, an
educational assessment of preprimary skills would be appropriate at the
kindergarten level; 3) a further suggestion was to allow testing over a
longer period of time than the three weeks of the regular program.
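
The 75% criterion reported above amounts to a simple count over objectives. The following is a minimal sketch of that count, not the procedure actually used by the program: the function name, the objective labels and the proportions are all hypothetical, since the paper does not publish the per-objective figures.

```python
def objectives_meeting_criterion(proportions, criterion=0.75):
    """Return the ids of objectives for which the proportion of students
    answering every item correctly is at or above the criterion."""
    return [obj for obj, p in sorted(proportions.items()) if p >= criterion]


# Hypothetical data: objective id -> proportion of sampled students who
# answered every test item for that objective correctly.
sample = {"A-1": 0.91, "A-2": 0.74, "B-3": 0.75, "C-4": 0.52}

attained = objectives_meeting_criterion(sample)
print(attained)  # ['A-1', 'B-3']
print(f"{len(attained)} of {len(sample)} objectives meet the 75% criterion")
```

Applied to the real statewide data, the same count would be expected to return 29 of the 48 objectives tested in 1974-75.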

1975-76 MEAP KINDERGARTEN STUDY 

Objectives of the Study 

The MEAP kindergarten study has provided the Michigan Department of
Education the opportunity to assist 75 volunteer teachers at the classroom
level in implementing an educational needs assessment to aid in
instructional planning. Teachers were allowed during the period of September,
1975, through April, 1976, to: 1) select from the set of 132 state
approved preprimary objectives those important to her/his educational
program, 2) assess student attainment of the objectives at an appropriate
time in the teaching sequence, 3) choose among four assessment modes a
preferred way to test student attainment, and 4) maintain a record of
individual and group skill attainment. Thus a teacher can individually
design an assessment program to address the needs and abilities of
students within the framework of the instructional program.

The desired outcomes for the state agency were to ascertain 1) those
preprimary educational objectives important to teachers of kindergarten
children, 2) the preferred assessment modes for the numerous preprimary
educational objectives, 3) the number of educational objectives which can
be assessed during the school year, 4) teacher reaction to the provided
test instruments, and 5) teacher reaction to the assessment model
prescribed by the study.

Methods 

The 1975-76 MEAP kindergarten study evolved from the suggestions of
teachers involved in the first year of the first grade pilot program. It
seemed appropriate to conduct this study in kindergarten classrooms since
many kindergarten teachers have built curricula based upon objectives
similar to the set of preprimary objectives and are already assessing
student attainment of many of these skills. Rather than limit the
assessment of preprimary objectives to only the objectives tested in the test
forms, the Department asked that teachers focus on the entire set of
preprimary objectives.



The Department suggested guidelines for this process and provided,
as one type of materials, the 75 MEAP assessment instruments for teachers
to use as they felt were appropriate. Teachers could also use three
other assessment modes: 1) other assessment instruments (commercial or
teacher-made tests), 2) teacher observation, and 3) other sources of
information about a child's skill attainment, such as another teacher
or parent. The study extended until the end of April, 1976, allowing
teachers sufficient time to assess students on skill attainment at a rate
which is compatible with each student's development.

During the 1975 spring assessment briefings, the Department staff
requested volunteers for the kindergarten special study. Those schools
indicating an interest were invited to a briefing which was held in Lansing,
Michigan. Following this meeting, district superintendents representing
125 schools sent a letter of commitment to the Department. Schools were
stratified according to size and geographical location. A total of 75
classrooms representing 37 districts and 70 schools were selected for this
study.
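
The paper states only that schools were stratified by size and geographical location; it does not describe how schools were drawn within strata. The sketch below illustrates one possible approach, under the assumption that schools are taken in rotation from each (size, region) stratum so that every stratum is represented before any is drawn twice. The function, the field names and the pool of schools are hypothetical.

```python
import random
from collections import defaultdict


def stratified_select(schools, n, seed=0):
    """Pick up to n schools, cycling through (size, region) strata so that
    every stratum is represented before any stratum is drawn twice."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for school in schools:
        strata[(school["size"], school["region"])].append(school)
    for members in strata.values():
        rng.shuffle(members)  # random order within each stratum
    keys = sorted(strata)
    chosen = []
    while len(chosen) < n and any(strata[k] for k in keys):
        for key in keys:
            if strata[key] and len(chosen) < n:
                chosen.append(strata[key].pop())
    return chosen


# Hypothetical pool standing in for the 125 committed schools.
pool = [
    {"name": f"School {i}",
     "size": ("small", "medium", "large")[i % 3],
     "region": ("north", "south")[i % 2]}
    for i in range(125)
]

selected = stratified_select(pool, 70)
print(len(selected))  # 70
```

A proportional allocation (drawing from each stratum in proportion to its share of the pool) would be an equally reasonable reading of the text; the rotation used here simply guarantees coverage of small strata.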

Each teacher received the 75 state test forms, an assessment
administration manual for each test, student booklets, and any additional
required test materials (beads, bean bags, cassettes, picture books, and so
forth). In addition, each teacher was provided an explanatory manual
describing the study, a class roster (record-keeping form), directions for
recording results and teacher comment sheets. During on-site visits in
September, all participating teachers were instructed by a Department
staff member on the use of the materials and the parameters of the study.

A class roster (APPENDIX B) for recording attainment of objectives
was designed for this study. A column for each of the preprimary
objectives appeared on the roster. When a child attained an objective, the
teacher was instructed to indicate under the column, and opposite the
student's name, the month the objective was attained and the assessment
mode.

Teachers were encouraged to assess as many of the preprimary objectives
as possible, using a variety of assessment modes; they were not expected
to use only the provided tests for a given objective. One of the desired
outcomes of the study is to learn the variety of ways a teacher appraises
skill attainment. For example, a teacher could test student attainment of
an objective using the provided test with some of the class and test
another portion of the class by teacher observation (a different assessment
mode). In some cases it might be suitable to use only one assessment mode
to measure attainment of an objective.

When the student attains an objective, the teacher indicates on the
provided class roster the date (only the month) and the assessment mode.
The possible assessment modes are coded as A = MEAP test; B = Other tests;
C = Teacher Observation; and D = Other, as explained below.

Assessment Mode A. If a MEAP test is used and the objective measured
by that form is attained (using a designated criterion level for each test
form) the teacher records a letter A and number indicating the month.

Assessment Mode B. Some teachers have utilized other tests to assess
student progress in specific skills. Examples of such tests are those used
in local or state evaluation activities, commercial tests, district tests,
or their own paper-pencil tests. This study gives the teacher the option
of utilizing these tests at her/his discretion. The criterion level for
attainment of each objective is determined by the test used. If this
assessment mode is used and it indicates the student has attained the
objective, the teacher is instructed to record the letter B and the month
the teacher determines the skill is attained.

Assessment Mode C. Teachers may, in the course of teaching, observe
students demonstrating attainment of some of the performance objectives.
The purpose of providing this assessment mode is to allow teachers who
observe attainment to note this and therefore to omit formal testing of
these skills for the students observed. In some cases, teachers may
design a structured situation, perhaps similar to that used in a more "formal"
test, in order to quickly assess students.

Assessment Mode D. If the teacher judges that a student has developed
a specific skill through another assessment mode, such as a student
interview or by talking with the student's parents or some technique other than
MEAP tests, other tests, or teacher observation, the teacher records the
letter D and the month this determination is made.

A copy of the class roster (the record of student attainment) was
returned to the Department in May, 1976. Teachers can keep their copy for
possible use as a diagnostic report to first grade teachers. The rosters
will be analyzed by Measurement Research Center/Westinghouse Learning
Corporation, Iowa City, Iowa, to determine the number of objectives attained
per month and the specific assessment mode used for each objective.
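
The roster analysis described above (attainments per month, and assessment mode used per objective) can be sketched as a small tally. This is an illustration only, not the contractor's actual analysis. It assumes each roster cell holds the mode letter immediately followed by the month number (for example "C11" for teacher observation in November); that cell layout, the function names and the sample roster are hypothetical.

```python
from collections import Counter

# Mode codes as defined in the study.
MODES = {"A": "MEAP test", "B": "Other tests",
         "C": "Teacher observation", "D": "Other"}


def parse_entry(entry):
    """Split a roster entry such as 'C11' into (mode letter, month number)."""
    mode, month = entry[0].upper(), int(entry[1:])
    if mode not in MODES or not 1 <= month <= 12:
        raise ValueError(f"bad roster entry: {entry!r}")
    return mode, month


def tally(roster):
    """roster maps student -> {objective: entry}. Returns two Counters:
    attainments per month, and usage per (objective, mode) pair."""
    per_month, per_mode = Counter(), Counter()
    for entries in roster.values():
        for objective, entry in entries.items():
            mode, month = parse_entry(entry)
            per_month[month] += 1
            per_mode[(objective, mode)] += 1
    return per_month, per_mode


# Hypothetical roster for two students and two objectives.
roster = {
    "Student 1": {"obj-1": "A10", "obj-2": "C11"},
    "Student 2": {"obj-1": "C10", "obj-2": "C2"},
}

per_month, per_mode = tally(roster)
print(per_month[10])             # 2
print(per_mode[("obj-2", "C")])  # 2
```

Keeping the month and the mode in separate counters mirrors the two questions the study asks: how many objectives are attained in each month, and which mode teachers prefer for each objective.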

Comment sheets were provided and teachers were invited to comment
in five specific areas: 1) comments about the study as a helpful curriculum
tool, 2) comments about the study as a facilitative means for assessing
student progress in skill attainment, 3) comments about the entire set of
tests, 4) comments about specific test items, and 5) comments about the
recommended criterion levels. If teachers chose other tests as an
administration mode, they were asked to describe the tests used for each
objective.




In addition to the Class Roster and teacher comment sheets,
participating teachers and principals received two mailed surveys which probe
their reactions to the study as a viable assessment program for the
kindergarten level. Specifically, teachers are asked if, having participated in
the study and as a result of the methodology of the study, they are in a
better position to plan instructional programs based on the needs and
strengths of children, and whether they have more information about each
child's progress. Principals are asked if, as a result of the study, they
are in a position to make better judgements about student placement and
program planning.

Conclusions 

While the final assessment results and surveys were only collected in
late May and will not be analyzed before July, interchange at follow-up
meetings with teachers in November and February has indicated that the
study is viewed as a positive and helpful service by the Michigan
Department of Education to assist teachers in tailoring a local needs assessment.

In response to the enthusiasm of teachers and principals, the
Department will offer the kindergarten program on a volunteer basis in 1976-77.
Originally this coming year's project allowed for the inclusion of 100
elementary schools. Due to the extensive requests to participate, the
study will be conducted in 200 schools. The Department will provide each
participating school a manual containing 1) test administration and scoring
directions for each MEAP test, 2) individual and group record keeping forms,
3) suggested observation techniques, and 4) example classroom activities to
assess student skill attainment. In addition, a kit containing all
handouts (cassettes, picture books, beads, geometric shapes, alphabet cards,
and so forth) for administering MEAP tests will be included. A set of
ditto masters of the student booklets will be provided. This should
alleviate storage problems and permit the teacher the option of
duplicating only those specific tests which he or she decides to use. Regional
meetings will be held in August and September to brief participants on
the details of the program.



Implications of the Study 



It is critical for state agencies, in addition to providing
curriculum and program guidance, to instruct and assist local education agencies
in the design and use of assessment techniques for local assessment programs.
Further, the state agency can serve as a resource service by providing tests
to measure a variety of curriculum objectives important to local educational
programs. The study is proving to be a viable model whereby a state agency
can provide an assessment program design, tests, and record-keeping forms,
and yet permit autonomy at the local level.






APPENDIX A 



AFFECTIVE OBJECTIVES FOR PREPRIMARY STUDENTS

A. EMOTIONAL BEHAVIOR

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Recognize at least three of five basic emotions (fear, anger, sadness, joy, love) in self and
others;

2. Recognize some basic causes of familiar emotional responses (e.g., sad, happy, angry, etc.);

3. Begin to show empathy for and awareness of the feelings, needs, and desires of others;

4. Actively express feelings nonverbally;

5. A greater ability to verbalize affective experiences (e.g., positive and negative feelings,
wants, values, conflicts, etc.);

6. Display an increased repertoire of behavioral responses by which to solve affective problems
(e.g., create their own solutions; seek help from parents, teachers, and others; give help to
other children; etc.);

7. Given situations in which gratification must be delayed, will demonstrate increased ability to
accept imposed delay and to regulate behavior appropriately.

B. SELF CONCEPT

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. An increase in positive self-image;

2. Given role-playing and real-life situations, will demonstrate an increased awareness of their
relationship to their family and to the wider community and environment;

3. Given role-playing and real-life situations, will demonstrate an increased awareness of racial
and cultural similarities and differences;

4. An increased understanding of the concept of sexuality (i.e., recognize their sexual
identification; are comfortable with own sexuality and the sexuality of others);

5. Given role-playing and real-life situations, will demonstrate a healthy, self-respecting attitude
towards their bodies and its simple physiological functions;

6. Given various roles to play (such as occupational, parental, emotional, cultural, or
situational) will demonstrate awareness and sensitivity for these roles.

C. SOCIAL RELATIONSHIPS

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Widen peer and adult relationships by demonstrating increased ability to play with one or
more children and to relate to a larger group:

1.1 An increased capacity to cope with strange and/or new surroundings and with familiar
and unfamiliar people;

1.2 An increased ability to seek help from others when needed and when appropriate;

2. Begin developing social interdependence by exhibiting an increased awareness of the
importance of give-and-take in social and work relationships:

2.1 Exhibit evidence that they are accepting of differences in others;

2.2 Demonstrate their ability to listen to others;

2.3 Exhibit the quality of sharing with others;

2.4 Demonstrate that they have learned to ask permission to use objects belonging to
another person;

2.5 Demonstrate that they can recognize cause and effect in the behavior of others, and the
effects of their behavior on others;

2.6 Exhibit greater participation in activities and in communication with others;

3. Identify several workers from different occupational areas in the community and tell
something about their work;

4. Name some of the people children learn from and what they learn from them;

5. Participate in decision-making situations (e.g., make personal or group rules for classroom
behavior, etc.).

D. BEHAVIORAL RESPONSE TO CLASSROOM ENVIRONMENT

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Willingness to accept reasonable limits set upon behavior, play space, use of materials, or
the type of activities in which engaged;

2. Acceptance of routines (e.g., daily schedules, room arrangements, adults, etc.) and changes
in routines;

3. Cooperation and independence (without help or demonstration) in following verbal
directions for three or more sequential instructions;

4. Increased independence in the areas of personal hygiene, eating, and dressing;

5. Increased ability to independently begin, work through, and continue an activity;

6. Increased ability to accept responsibility for the use and care of their portion of the
classroom environment.

PSYCHOMOTOR OBJECTIVES FOR PREPRIMARY STUDENTS

A. GROSS MOTOR BEHAVIOR

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Balance while walking (e.g., will be able to walk at least ten feet on a straight three-inch
taped line without stepping completely off the line with either foot);

2. Balance while running (e.g., will be able to run to a target placed no more than twenty feet
away without stopping or veering off a path approximately five feet wide);

3. Muscle coordination (e.g., will be able to jump with both feet rising together over a
three-inch taped line);

4. Muscle coordination and balance (e.g., will be able to hop three consecutive times using one
foot);

5. Eye-foot muscle coordination and balance (e.g., will be able to kick a [size illegible]-inch ball
without losing his balance or falling);

6. Eye-hand coordination (e.g., given a bushel basket tilted toward him at a 45-degree angle
and placed four feet in front of him, the child will throw a bean bag into the basket);

7. Touch or move parts of the body (e.g., head, arms, elbows, hands, legs, knees, feet) called
for by the teacher;

8. Free body movement by physically responding to music, song, rhythm, and/or rhymes;

9. Leg coordination (e.g., will be able to skip or gallop, leading with the preferred foot).

B. FtNE MOTOR BEHAVIOR 

By the end of the preprimary experience* students should be able to demonstrate the following 
bahaviofs as measured by teacher obse^vetton and/or objective referenced instruments: 
t Digital coordination (a.^.» place a three quarter inch button through a one^inch button hole): 
2, Digital coordination {a^>i by taeing abt.i to place ten small one*half inch beads on a lacing 
strlngjf 




3. Eye-hand coordination (e.g., given a ten-minute time limit, will be able to put together a
simple puzzle of five to eight pieces);

4. Thumb-finger coordination (e.g., given a pair of child's scissors and a strip of one-inch by
six-inch construction paper, can make clean cuts three times in five attempts without folding
or tearing the paper);

5. Eye-hand coordination (e.g., given a large crayon and at least a two-inch model of a circle,
will be able to copy the model in such a manner that the curved line closes);

6. Eye-hand coordination and lateral movement (e.g., given a large crayon and at least a
two-inch model of two intersecting lines, will be able to copy the lines so that they intersect
in some manner);

7. Improved eye-hand coordination (e.g., given materials such as interlocking blocks or other
available small blocks, will be able to build a stable eight-piece vertical structure or design).

COGNITIVE OBJECTIVES FOR PREPRIMARY STUDENTS

A. LANGUAGE DEVELOPMENT

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Enjoyment in looking at books and listening to stories;

2. Produce pictures and/or scribbles of own creation which are used as a basis for
communication;

3. Listen and react to another's oral language;

4. Given an oral story which expresses a mood (e.g., happy, sad, angry, afraid), will identify the
characteristic mood of the story;

5. Given an oral stimulus requiring a specific bodily response (e.g., the game "Simon Says"),
will provide the appropriate response;

6. Talk about a picture or a group of two or three related pictures;

7. Tell about personal experiences;

8. Distinguish environmental sounds they hear (e.g., traffic sounds, dog barking, baby crying,
etc.);

9. Given three single syllable sounds, two of which rhyme, will select the two which rhyme;

10. Express an idea or ask a question orally of another person (e.g., explaining how a toy
works/asking how a toy works);

11. Given a small group situation, will share own ideas and listen to the ideas of others;

12. Talk about the feelings associated with events;

13. Non-verbally imitate or role-play the simple action of people or animals;

14. Name likenesses and differences in pictures, objects, and shapes;

15. Recognize some letters of the alphabet;

16. Given a sequence of pictures portraying a story, will tell about the story by responding
appropriately to each picture;

17. Print first name correctly;

18. Recognize first name.

B. CLASSIFICATION AND ORDERING

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Given two kinds of objects in a large set (e.g., elbow and shell macaroni or bottle caps and
checkers), will sort the objects into two sets according to their separate characteristics;

2. Given an object of a specific color, will pick an object which is of the same color;




3. Group items on the basis of common function (e.g., things to eat with, things to wear, things
to play with, etc.);

4. Group items on the basis of association (e.g., hammer and nail, shoe and foot, etc.);

5. Identify and group items on the basis of general classes or categories (such as furniture,
animals, plants, etc.);

6. Given items of common qualities (e.g., texture, weight, loudness, speed, temperature, color),
will group and match items on the basis of these qualities and be expected to know and use
at least two of the comparative terms (e.g., soft-hard, loud-quiet, fast-slow, smooth-rough,
hot-cold, dark-light, heavy-light) to identify the groupings;

7. Given a pattern using objects of two or more colors, will duplicate the pattern selecting from
a set of similar objects;

8. Given a set of ten objects of assorted color and shape, will pick out objects having specific
combinations of the two attributes;

9. Given one series of three objects arranged in a pattern by color or shape and the first object
of the second series, will complete the second pattern series;

10. Given a variety of objects, will group some of the objects into a classification system
according to their own perceptions.

C. NUMBER NUMERATION

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Given a set of coins of a penny, nickel, dime, will pick and name each one;

2. Given a collection of five objects of varying lengths, will pick up the longest or the shortest
as requested;

3. Given a set of five pictures of objects of various heights, will arrange the pictures so that the
objects are ordered from shortest to tallest;

4. Given two objects of decidedly different weights, will hand to the teacher the one that is
heavy or the one that is light as requested;

5. Given the direction "count to ten", will recite the number names from one through ten in
the usual order;

6. Given an oral description of a set and a collection of objects, some of which belong to the
set and some of which do not, will pick up the objects that are members of the given set;

7. Given cutout pictures of any two sets (from one to five members), will place the pictures of
the sets in order, from that set with less members to that set with more members; then, will
order the set pictures from more to less;

8. Given numeral cards 1 through 5 and five sets of objects consisting of one, two, three, four
and five members, will place the sets in sequential order from the set with fewest to the set
with the most and then will place the numeral cards in front of the set having the number of
members named by the numeral;

9. Given a set of objects with 1 to [illegible] members, will count the members of the set and
state the cardinal number of that set;

10. Given pictures of sets with 0-9 objects and number cards from 0-9 (using felt numerals,
sandpaper numerals), will match the right numeral with the picture of the set having the
same number of members;

11. Given dot pattern cards showing sets of 0-10 dots, will count while pointing to the
appropriate dot card;

12. Given a set of 2 to 8 objects, the student, from his own group of more than 8 objects, will
construct a set having more members than the original set;

13. Given a set of 2 to 8 objects, the student, from his own group of objects, will construct a set
having fewer members than the original set;




14. Given an assortment of cutout shapes including squares, triangles, rectangles and circles of
various sizes randomly arranged, will select a given shape as requested.

D. SPATIAL RELATIONS

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Identify and name the following parts of his body: head, arms, hands, torso, legs and feet;

2. Knowledge of concepts of position (such as on-off, over-under, on top of, in-out, into-out of,
top-bottom, above-below, in front of-in back of, behind, beside-next to, by, between);

3. Knowledge of concepts of direction (such as up-down, around-through, forward-backward,
to-from, sideways, across);

4. Knowledge of concepts of distance (such as near-far, close to-far from).

E. TEMPORAL RELATIONS

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Ability to follow temporal commands (such as go, stop, at the same time, now, start, finish);

2. Understanding of time intervals (such as beginning-end, fast-slow).

F. NATURAL SCIENCES

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Given objects of various primary colors (red, blue and yellow), will be able to correctly
identify the colors;

2. Given an object to examine using their senses of sight, sound, smell, taste, and touch, will
exhibit the ability to describe certain characteristics (such as size, color, weight, texture,
temperature, odor, etc.);

3. Given an object (or picture of an object), will describe verbally by naming at least two
characteristics of the object (e.g., given a rubber ball, the student will give two of the
properties, such as color, shape (round), density (light), elasticity (bouncy), size (smaller
than my hand), temperature (cool), texture (smooth));

4. Given a set of objects or events, will arrange them in sequence in accordance with
prescribed criteria (e.g., given separate pictures of a dog and a puppy or a flower and some
seeds, the student will arrange them in proper order);

5. Given an object or picture which changes with successive observations, will state at least
one of the properties which is changing (e.g., the student tastes a sample of unbaked cookie
dough and a sample of a cookie made from the same dough and describes what changed in
the baking (hardness, texture, color, taste, smell));

6. Given a magnifying glass and an object or organism with some characteristic not visible
without a lens, can observe the object or specimen with the lens and identify at least one of
the characteristics;

7. Given a picture or group of pictures showing items which comprise both live and non-live
things, can point to examples of living and non-living things.

G. SAFETY

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Awareness of common hazards encountered in daily living (e.g., toxic household chemicals
or substances, electricity, toxic plants, explosive and combustible substances, etc.);






2. Adhere to safety rules in the home, to and from school, and in the school;

3. Perform safely as pedestrians, as passengers in motor vehicles, and as tricycle operators.

H. FINE ARTS
Art:

The joy in creativity should be emphasized throughout all fine arts instruction. The process is
more important than the product. By the end of the preprimary experience, students should be
able to demonstrate the following behaviors as measured by teacher observation and/or
objective referenced instruments:

1. Pleasure and enjoyment in a variety of art experiences;

2. Use a variety of media (such as paint, crayons, finger paint, felt markers, etc.);

3. Create two- and three-dimensional forms using a variety of manipulative materials (such as
clay, paper-mache, blocks, etc.);

4. Recognize color in the natural environment and in the man-made environment;

5. Use a variety of color in the production of art;

6. Recognize that lines define space (e.g., uses line in a variety of ways to express length, size,
or shape);

7. Recognize the direction of line (e.g., down, slanted, over, across, etc.);

8. Identify the characteristics of line (e.g., fat, thin, winding, climbing, etc.);

9. Use a variety of lines in his art activities;

10. Distinguish between two- and three-dimensional forms;

11. Develop compositions using size, shape, direction, overlapping shapes and/or repetition;

12. Use a combination of various textures in art forms;

13. Recognize differences in his art work (e.g., size, surface, parts of objects, shape, texture,
etc.);

14. Use flat, curved and irregular surfaces in producing three-dimensional forms.

Music:

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Create music on a variety of classroom instruments;

2. Freely express the mood of music through body movement;

3. Through physical movements (e.g., clap, march, walk, run, play rhythm instrument)
demonstrate his ability to respond rhythmically to pulse or beat in music;

4. Repeat a very simple rhythm, individually or in a group (e.g., singing, chanting, speaking,
clapping, using rhythm instruments);

5. Participate with a group in singing simple, familiar melodies;

6. Upon hearing music, will recognize whether a melody moves up or down;

7. Upon hearing music, will recognize fast and slow tempos;

8. Distinguish between long and short tones.

I. AESTHETIC APPRECIATION

By the end of the preprimary experience, students should be able to demonstrate the following
behaviors as measured by teacher observation and/or objective referenced instruments:

1. Begin to develop aesthetic appreciation by responding emotionally, through non-directed
spontaneous self-expression (drawing, painting, movement, self-report), to moods and
feelings in art, music, movement, drama, poetry, prose and nature;

2. Begin to recognize the beauty or aesthetic qualities of his own work as well as the work of
others;






3. Value his art experience (e.g., comfortable with art activities, willingly participates in art
activities, expresses personal satisfaction with art activities, voluntarily elects to repeat the
art experiences, demonstrates pride in art work, expresses himself through color, etc.);

4. During an art activity, will voluntarily use a variety of patterns and both two- and
three-dimensional forms;

5. Indicate a preference for certain textures in the daily art experience;

6. React to musical experience by voluntarily responding in out-of-school situations (e.g.,
discusses music class happenings, sings songs learned at school, chooses to listen to music
programs on radio, television, etc.);

7. React to musical experience by voluntarily responding during school (e.g., expresses a
reaction when it is time for music, joins in quickly, freely, or slowly when musical activities
begin, expresses reactions to the music class during classtime or when it has ended, brings
a favorite record to school, seeks opportunities to play classroom instruments, etc.).






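CLASS ROSTER

[Class roster form. Legible fields: Teacher, School, District. Legible data-source codes:
B = Other Tests, C = Teacher Observation; a further code is labeled Other. Column group
AFFECTIVE DOMAIN: Emotional Behavior, Self Concept, Social Relationships.]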






ESTABLISHING CRITERION LEVELS FOR 
JUDGING THE ACCEPTABILITY OF ASSESSMENT RESULTS

Iris Weiss and Larry Conaway 

Larry Conaway, Presenter 
Research Triangle Institute 



Mrs. Iris R. Weiss
Education Research Scientist
Research Triangle Institute
PO Box 12194
Research Triangle Park, N.C. 27709

Mr. Larry Conaway
Senior Educational Research Scientist
Research Triangle Institute
PO Box 12194
Research Triangle Park, N.C. 27709






ESTABLISHING CRITERION LEVELS FOR
JUDGING THE ACCEPTABILITY OF ASSESSMENT RESULTS

IRIS WEISS and LARRY CONAWAY
RESEARCH TRIANGLE INSTITUTE

I. INTRODUCTION

A. The Problem

One approach for judging the acceptability of student performance has
been to compare state and local assessment results to national and regional
results reported by the National Assessment of Educational Progress. These
types of comparisons have provided useful information for identifying possible
strengths and weaknesses in curriculum and instruction at the state and local
levels. However, educators who have been involved in interpreting these
results have indicated a need for additional types of criteria. The fact that
the state performance levels on certain items were significantly below the
national performance levels does not necessarily mean that these are areas
of weakness in the state's program. It may be that these items are poorly
constructed items or that they are measuring skills which are considered to
be of low priority in that state. Similarly, if the Nation's students
performed poorly in a particular area, surpassing the Nation is not necessarily
an indication of strength. A state can do better than the Nation but still
do a poor job. It was obvious that these types of statistical comparisons
of performance would not suffice; professional judgment would have to be brought
to bear in identifying strengths and weaknesses.

Several states and local districts have attempted to obtain this
professional judgment by involving educators in interpreting assessment results.
They have brought together groups of educators to study the assessment results,
to determine areas where performance was particularly weak, and to make
recommendations for change. Unfortunately, the criteria which they used to
identify strengths and weaknesses were often quite vague. With the actual
results in front of them, educators found it difficult to think about what
they really would have liked performance to be. There was a tendency to
consider areas with low p-values as weak and those with high p-values as strong
without regard to the inherent difficulty of the items used to assess these
areas. There was also a tendency to "make excuses" for poor performances, to
judge a performance level acceptable because it was "about the best you
could expect considering the level of present instruction." The idea that
instruction might therefore need changing sometimes did not emerge.

There was a need for procedures to establish a priori standards with
which educators could compare student performance to assist them in judging
the acceptability of assessment results. This paper describes procedures
which have been developed to establish a priori standards and discusses some
of the uses and limitations of these procedures.

B. The Literature 

The review of the related literature provides only a little guidance in
establishing practical procedures for setting a priori performance standards.
Airasian and Madaus (1972) stated that the area of setting standards was the
area of criterion referenced measurement in most need of research, and Quirk
(1974) discussed a similar lack of research in the context of performance
based teacher education.

Two very recent articles contended that the lack of research in this area
still exists today. Meskauskas (1976) reviewed procedures suggested for setting
the pass-fail point and concluded that only one method has received extensive
practical use. This method was developed by Nedelsky in 1954 and involves
teacher judgment of the distractors which minimally passing students will
be able to identify as incorrect. This model has been used extensively in
medical education, and literature is available on several applications including
a recent one by Smilansky and Guerin (1976).

Jaeger (1976, p. 13) also reviewed a number of standard-setting procedures,
and he summarized the present situation as follows:

The research that exists on standard-setting procedures in competency-
based education appears to be largely theoretical .... Throughout this
paper, I have identified questions for which research-based answers are
apparently unavailable. Application of statistical models and
theoretical formulation are unlikely to provide answers to these questions.
There is need instead for empirical investigation involving human
standard-setters in real or simulated judgmental situations, using real
performance data and real descriptions of task domains.






After a decision was made to develop a new set of procedures for
setting a priori standards, the literature did provide some guidance in
deciding to have educators set standards for individual items rather
than objectives. Millman (1973) stated that it is difficult to defend
the frequent practice of employing a particular passing score only on
the grounds of tradition. Quirk (1974, p. 317) was very specific in his
discussion of fixed cutoff scores in performance based teacher education:

While [fixed cutoff scores] sound semiscientific, they do not
possess much substantive value. The percentage of items related to
an objective which a candidate answers correctly is a function not
only of the content of the items, but also of the difficulty of the
items. An estimate of the difficulty of the items can be obtained
either from a logical judgment based on a study of the specific
items or from empirical item-analysis data.

After discussing the state of research in standard-setting, the
state-of-the-art in domain referenced testing, and the problems associated
with learning hierarchies, Airasian and Madaus (1972) concluded that
teachers will have to establish their own standards using expert opinion,
experience, face validity of items, and group consensus.

C. Development and Use of the Procedures

The procedures were first developed in early 1974 by staff members of the
Research Triangle Institute, the Minnesota Department of Education, and the
University of Minnesota. They were first used in conjunction with the
Minnesota statewide assessment of reading and the Maine statewide assessment of
reading by surveying a sample of teachers with mailout questionnaires.
Consensus procedures were then developed, and these have been used by Research
Triangle Institute staff in the Richfield (Minnesota) local school district,
the Guilford (Connecticut) local school district, and the Maine and Washington
statewide assessments. Similar procedures have also been used by others in
conjunction with the Oregon statewide assessment, and the Minnesota Educational
Assessment Program has continued to use these procedures.

Those who have used these procedures have not attempted to present
or use the a priori criterion standards as absolutes. Rather, the standards
have been presented as carefully considered professional judgments. They
have been used along with other information (normative data and individual
item results) to assist those who interpret assessment results in judging
the acceptability of these results.




II. OBTAINING A PRIORI ESTIMATES OF MINIMAL ACCEPTABLE,
DESIRED, AND PREDICTED PERFORMANCE LEVELS

A. Definitions

The procedures developed for establishing criterion levels involved
having educators examine individual assessment items and estimate minimal,
desired, and predicted performance levels. The definitions were specifically
adapted for each assessment, but they have been generally defined as follows:

1. Minimal Acceptable Outcome - The percent of students you believe
must be able to respond correctly to a particular item in order
for you to consider instruction to be providing essential skills
to these students.

2. Desired Outcome - The percent of students you believe should be
able to respond correctly to a particular item.

3. Predicted Outcome - The percent of students you believe will
respond correctly to a particular item.

B. Statewide Samples

In the two earliest studies statewide samples of third and fourth grade
teachers in Minnesota and Maine were asked to make these estimates for
reading items administered to 9-year-olds using mailout questionnaires.
The teachers' responses to these mailout questionnaires were averaged to
provide statewide minimal acceptable, desired and predicted performance
levels for each item. Actual student performance levels were then
compared to each of these levels. In addition, average student performance
levels on groups of items in an objective were compared to teacher
expectations averaged over these same items. Items or objectives which
had actual performance above the desired level could be considered
strengths, and those which were below the minimal acceptable levels
could be considered weaknesses.
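The averaging and comparison just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `average_levels` and `classify`, the label "neither" for the middle range, and the three sets of teacher estimates are assumptions made for the example, not part of the studies.

```python
from statistics import mean

def average_levels(teacher_estimates):
    """Average (minimal, desired, predicted) estimates over all teachers for one item."""
    minimal, desired, predicted = zip(*teacher_estimates)
    return {
        "minimal": mean(minimal),
        "desired": mean(desired),
        "predicted": mean(predicted),
    }

def classify(actual, levels):
    """Label an item by comparing actual percent correct with the averaged levels."""
    if actual > levels["desired"]:
        return "strength"
    if actual < levels["minimal"]:
        return "weakness"
    return "neither"  # hypothetical label for the range between the two levels

# Hypothetical estimates from three teachers: (minimal, desired, predicted) percents.
estimates = [(50, 80, 60), (60, 90, 70), (55, 85, 65)]
levels = average_levels(estimates)

for actual in (90, 70, 40):
    print(actual, classify(actual, levels))
```

The same comparison could be applied to an objective by first averaging the actual and estimated levels over the items in that objective.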

Three independent samples of teachers were involved in each study;
the teachers' estimates for each item were relatively stable across the
three samples, indicating basic reliability of the estimates obtained
using the instrument. The results also showed that the teachers' estimates
of predicted performance were generally quite close to actual student
performance; they were within 15 percentage points on 77% of the items
in one study and on 73% of the items in the other study. When averaged
across items in an objective, the teachers' predictions were extremely
close to actual student performance. The procedures and results of
these studies are presented in detail in a paper by Elliott (1974).

At first glance these results seem to indicate that teachers are
remarkably good predictors of student performance. However, using
averages of teacher responses conceals the fact that for some items
there was marked variability in teacher estimates. For example, the
teachers' predictions for one item averaged to 62.4%, which was extremely
close to the actual student performance of 59.5%. However, nearly
one-fourth of the teachers predicted that at least 80% of the students would
answer this item correctly, while more than 10% of the teachers predicted
p-values of 40% or less. Similarly, when obtaining the average prediction
for an objective, the fact that the teachers overpredicted on some items
and underpredicted on others resulted in predicted average values which
were very close to actual average performance values. Clearly, these
averages do not indicate whether the teachers are good diagnosticians.

Using averages obtained by mailout survey techniques to determine
minimal acceptable and desired criterion levels causes similar problems.
For example, in an extreme case, half of the teachers may think an item
is inappropriate for all but the brightest 9-year-olds and may therefore
set the minimal acceptable level at 10%; the other half of the teachers
may think the item is vitally important for 9-year-olds and may set the
minimal acceptable level at 90%. Using the average value (50%) as the
minimal acceptable criterion level does not really reflect the teachers'
judgments.
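The extreme case above can be worked through numerically. The sketch below is illustrative only: the panel of ten teachers, the function name `summarize`, and the 15-point tolerance used to count teachers "near" the average are assumptions introduced for the example.

```python
from statistics import mean, median, pstdev

def summarize(estimates, tolerance=15):
    """Summarize teacher estimates and report how many lie near the average."""
    avg = mean(estimates)
    near = sum(abs(e - avg) <= tolerance for e in estimates)
    return {
        "mean": avg,
        "median": median(estimates),
        "spread": pstdev(estimates),           # population standard deviation
        "share_near_mean": near / len(estimates),
    }

# Hypothetical panel: half the teachers say 10%, the other half say 90%.
estimates = [10] * 5 + [90] * 5
print(summarize(estimates))
```

With this even split the median is also 50, and no teacher's estimate lies within 15 points of the average, so only a measure of spread (or a look at the full distribution) reveals the disagreement.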

The analysis problems involved in using averages raised the question
of whether some other indication of central tendency such as median or
modal response should be used in determining minimal acceptable and
desired performance levels. More importantly, the various types of
important considerations causing great diversity of opinion on some
items pointed out the need for groups of educators to communicate about
the objectives of instruction in a given subject area. For this reason,
and because using large samples of teachers was quite expensive,
consensus procedures were adopted so that relatively small groups of educators
would be able to meet together to establish criterion levels.

C. Consensus Procedures

The first study utilizing consensus procedures with a group of
educators was conducted in Richfield, Minnesota, a school district
which was conducting a local assessment in conjunction with the Minnesota
statewide reading assessment. A committee of educators representing a
wide range of classroom situations from remedial to advanced ability
groups met to discuss the assessment items and to reach consensus on
minimal acceptable, desired, and predicted performance levels.

The consensus procedures yielded more extreme estimates than did
averaging responses of a statewide sample of teachers. For example, for
identical items administered to 9-year-olds the predicted levels established
by Richfield teachers using consensus procedures ranged from 20% to 96%
while the statewide estimates ranged only from 47% to 75%. The differences
in ranges were similar for minimal acceptable and desired performance
levels.

Results are available for comparing predicted levels established by
consensus procedures with student performance from three studies. The
Richfield, Minnesota assessment of reading was conducted at ages 9 and
13; the Maine assessment of mathematics was conducted at ages 13 and 17;
and the Guilford, Connecticut assessment of science was conducted at
age 17. These results are shown in Table 1.






Table 1

PERCENT OF ITEMS FOR WHICH THE CONSENSUS PREDICTED
PERFORMANCE LEVEL WAS WITHIN 15 PERCENT OF THE
ACTUAL STUDENT PERFORMANCE LEVEL

                  Richfield           Maine          Guilford
               Age 9   Age 13    Age 13   Age 17      Age 17

0% - 5%         31%     36%       27%      23%         23%
6% - 10%        20%     22%       20%      20%         16%
11% - 15%       19%     13%       11%      21%         16%

TOTAL           70%     71%       58%      64%         55%

In general, the consensus groups have been reasonably accurate in their
predictions of student performance. As can be seen in Table 1, across the
five consensus groups the percent of items for which predicted performance
levels were within 15 percent of actual student performance ranged from 55%
to 71%.
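A tabulation of the kind reported in Table 1 can be sketched as follows. The item-level data below are hypothetical, the function name `band_percentages` is introduced for illustration, and the treatment of band edges (a gap of exactly 5 counts in the first band, and so on) is an assumption, since the paper does not state how boundary cases were handled.

```python
def band_percentages(predicted, actual, edges=(5, 10, 15)):
    """Percent of items whose |predicted - actual| gap falls in each band.

    Bands are (0..5], (5..10], (10..15] percentage points by default;
    items with a larger gap are not counted in any band.
    """
    counts = [0] * len(edges)
    for p, a in zip(predicted, actual):
        gap = abs(p - a)
        for i, edge in enumerate(edges):
            if gap <= edge:
                counts[i] += 1
                break
    n = len(predicted)
    return [100 * c / n for c in counts]

# Hypothetical consensus predictions and actual percent correct for ten items.
predicted = [60, 75, 40, 90, 55, 30, 80, 65, 50, 70]
actual = [58, 66, 52, 60, 55, 41, 77, 80, 20, 63]

bands = band_percentages(predicted, actual)
print(bands, "total within 15 points:", sum(bands))
```

For these made-up items the three bands hold 30%, 20%, and 30% of the items, so 80% of the predictions fall within 15 points of actual performance.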

The teachers' estimates of minimal acceptable and desired performance
levels have been used to determine a priori classifications of strength or
weakness in objectives within the subject area. The average student
performance level across the items in an objective is compared to the average minimal
acceptable performance level and the average desired level across the items.
While the cutoff points have varied somewhat for different studies, student
performance above the desired level has generally been defined as a strength
for an objective, and student performance below the minimal acceptable level
has generally been defined as a weakness.

The locally developed criterion levels have been used to assist in judging
the acceptability of student performance. The procedures used in obtaining
these estimates have yielded beneficial side effects as well; they have
provided a vehicle for establishing communication among teachers within
a school and across schools within a school district. The discussions
have focused attention on educational objectives and student capabilities,
and they often result in open debate about what should be taught compared
to what actually is taught. Criterion levels developed through the use
of consensus procedures for statewide assessments have proved similarly
useful in establishing communication among teachers, administrators, and
university and state department of education personnel.

III. INTERPRETING ASSESSMENT RESULTS

A. Some Problems in Interpreting A Priori Criterion Levels

The discussion thus far has focused on procedures for establishing
criterion levels prior to examining assessment results. Some of the
advantages of these a priori criteria have been pointed out. However,
these procedures also have some serious limitations. Many teachers who
have been involved in establishing statewide criterion levels have
indicated that they felt uneasy making statewide estimates; they would
have felt much more comfortable setting minimal acceptable, desired, and
predicted performance levels for their own classes. The same problem
exists in local districts, where some teachers have very slow classes
and others teach only the most gifted children, but the extent of the
problem is considerably smaller at the district level.

An additional limitation of these procedures is that the estimates
are based only on the percent of students who answer each item
correctly. The relative frequency of certain types of errors does not
enter into the establishment of these a priori criterion levels. This
is a serious weakness since it is likely that error patterns would
affect judgments about the acceptability of student performance. Consider
the following hypothetical situation: The minimal acceptable level for
an open-ended mathematics word problem item was set at 60%; only 40% of
the students were correct, which may indicate that performance was weak.
However, another 35% of the students set up the problem correctly but
made minor computational errors. Thus a total of 75% of the students
demonstrated that they knew how to go about finding the correct answer,
and a group of educators might well judge this performance to be satisfactory.
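The arithmetic in this hypothetical situation can be written out as a short Python sketch. The function and parameter names are illustrative assumptions; the only figures used are the 40%, 35% and 60% quoted above.

```python
def judge_item(pct_correct, pct_method_only, minimal_level):
    """Compare one item with its minimal acceptable level two ways.

    pct_correct     -- percent of students with the right answer
    pct_method_only -- percent who set the problem up correctly but
                       slipped on the computation
    Returns (meets level on percent correct alone,
             percent who showed they knew the method,
             meets level once method-only responses are counted).
    """
    meets_on_correct = pct_correct >= minimal_level
    pct_knew_method = pct_correct + pct_method_only
    meets_on_method = pct_knew_method >= minimal_level
    return meets_on_correct, pct_knew_method, meets_on_method


# Figures from the hypothetical word-problem item in the text.
result = judge_item(pct_correct=40, pct_method_only=35, minimal_level=60)
```

On percent correct alone the item falls short of the 60% level, but 75% of students showed that they knew the method, which is the pattern a committee could only see by looking at error types.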






As another example of error patterns affecting judgments of
acceptability, consider the following hypothetical situation: The desired
performance level for an item involving metric measurement was set at 70%, and
69% of the students were correct, which would indicate that performance
was satisfactory. However, many students chose a response alternative
which indicated that meters are used to measure volume. Many mathematics
educators might consider the frequency of this error to show an area of
weakness even though the percent correct was at about the desired level.

A related problem is the fact that some prior expectations are
based on rather inaccurate judgments about the difficulty of certain
items. For example, at first inspection the following item seemed to
be quite straightforward, and a committee of mathematics educators
predicted that 50% of Maine 17-year-olds would choose the correct answer.
The actual performance (32% correct) was well below both the minimal
acceptable level (60% correct) and the desired level (75% correct), which
would seem to indicate a weakness.

     A housewife will pay the lowest price per ounce for rice if she buys
     it at the store which offers

                1.  12 ounces for 40 cents.
        6.6%    2.  14 ounces for 45 cents.
       32.1%    3.  1 pound, 12 ounces for 85 cents.
       47%      4.  2 pounds for 99 cents.

The assessment results showed that a very large number of students
chose foil 4 rather than the correct response. The committee examined
the item closely and speculated that there were two major reasons for
the performance on this item. First, choice 4 was the largest package
size, and larger sizes typically have smaller costs per ounce. Second,
the correct answer had a price of 3.04 cents per ounce while choice 4
had a price of 3.09 cents per ounce. To answer the item correctly a
student would have to be able to set up the problem correctly and carry
out each division to two decimal places; the problem was not one of
simple estimation as it had first appeared. The committee concluded
that, considering the complexity of the item, student performance was
satisfactory.
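The divisions the committee refers to can be checked with a brief Python sketch. The package sizes and prices are as best read from the item above; the 14-ounce size for choice 2 in particular is a reading of a poorly reproduced original and should be treated as an assumption. The two unit prices quoted in the text (3.04 and 3.09 cents per ounce) are reproduced by choices 3 and 4.

```python
OUNCES_PER_POUND = 16

# choice number -> (ounces, price in cents), as read from the item
choices = {
    1: (12, 40),
    2: (14, 45),                      # size is an uncertain reading
    3: (1 * OUNCES_PER_POUND + 12, 85),
    4: (2 * OUNCES_PER_POUND, 99),
}

# Cents per ounce, carried to two decimal places as the item requires.
unit_prices = {
    choice: round(cents / ounces, 2)
    for choice, (ounces, cents) in choices.items()
}

# The best buy is the choice with the lowest unit price.
best_choice = min(unit_prices, key=unit_prices.get)
```

Choices 3 and 4 differ by only five hundredths of a cent per ounce, which is why rough estimation, or the rule of thumb that the largest package is cheapest, leads to the wrong answer.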

The procedures described in this paper are based on the belief that
the actual assessment items as well as the objectives have to be considered
when setting criterion levels, since the difficulty of specific items
must be taken into account. However, conclusions of strength or weakness have
been based on groups of items in objectives or skill areas rather than
on individual items. One of the potential drawbacks of these procedures
is that they result in conclusions of strength, satisfactory, or weakness
for each objective regardless of the number of items in the objective.
These results should be interpreted with great caution when there are
only 2 or 3 items in an objective, since educators may feel that the
objective is inadequately measured and that no conclusion is justified.

B. Combining A Priori Criterion Levels with Professional Judgment of
Assessment Results

To avoid some of the problems associated with establishing a
priori criterion levels without sacrificing the advantages, attempts
were made in Maine and Guilford, Connecticut to formally combine
professional judgments of assessment results with conclusions based on the a
priori criterion levels. In each case the committee had used consensus
procedures to set minimal acceptable and desired criterion levels. The
next step was carried out several months later, and the committee members
did not have access to these a priori criterion levels. Each committee
member was given a student assessment booklet which included indications
of the percent of students who chose each response alternative, including
"I don't know." Each committee member was asked to rate each item from
+2 (highly satisfactory) to -2 (highly unsatisfactory), and then the
entire group was brought together to discuss the results and reach
consensus on the item ratings. During this meeting there was much more
discussion of error patterns because actual student results were available
for each foil or response category. These item ratings were then used
to determine whether actual student performance on each objective was
strong, satisfactory, or weak.

The consensus committees were then able to look at both their a
priori classifications of strong, satisfactory, and weak objectives and
their ex post facto classifications when interpreting results. It is
interesting to compare the results of the a priori and ex post facto
procedures. In the Maine statewide mathematics assessment there were
several areas identified as weaknesses by both procedures, e.g., percents,
fractions, and word problems at age 13. Other areas, such as geometry at
age 17, were considered to be weak when the educators looked at the
actual results although the a priori criteria had resulted in conclusions
of satisfactory performance. Conversely, 13-year-old performance in the
area of probability was judged satisfactory after the fact even though
average performance on the items was well below the minimal acceptable level.
In several other cases, the committee decided that there were too few
items to draw firm conclusions about performance.

In the Guilford science assessment there were two areas identified
as strengths by both procedures: [?]olution-related items and
health-related items. Some areas, such as light and electricity, were classified
as strengths by a priori criterion standards but as satisfactory by ex
post facto judgments. Generally, however, there was a tendency for ratings
to be higher after the fact. For example, some areas, such as biological
science, were classified as satisfactory by a priori criterion standards
but as strengths by ex post facto judgments. Similarly, there were
three areas of weakness according to a priori criterion standards, but
no areas of weakness according to ex post facto judgments.

The committee members considered the a priori expectations to be an
important stage in the process of determining strengths and weaknesses
and interpreting assessment results. Working closely with the items at
an early stage in setting a priori criterion levels for the assessment
provided for great familiarity with the instruments. Also, comparison
of actual performance with desired and minimal acceptable levels of
performance, which were established without actual performance results,
provided valuable reference points in judging the acceptability of
performance within objectives. Finally, committee members identified
individual items for which performance was well below the desired level;
they then looked at these items and the types of errors students frequently
made and came up with ideas for improving the curriculum.



IV. DIRECTIONS FOR FURTHER DEVELOPMENT



In addition to developing techniques for judging the acceptability
of student performance, the use of teacher predictions of student performance
for determining needs for in-service staff development and pre-service
teacher education is being investigated. For example, predictions can
be used to determine if some teachers are poor predictors of student
performance while others are very accurate predictors, or if most teachers
are quite far off on quite a few of the items. Predictions can also be
used to determine if teachers recognize which skills their students have
and have not mastered and what types of errors will be most frequent.

A procedure has been developed which compares teacher predictions
and actual performance to determine the extent to which teachers recognize
the relative difficulty of items within each objective.[17] In this
procedure, a profile of teacher predictions is compared to a profile of
student performance. The overall distance between the two profiles
consists of two components: the distance between the means of the two
profiles and the residual difference between the two profiles when
adjusted for mean differences.

Figure 1 shows hypothetical examples of actual and predicted profiles
for a mathematics objective involving fractions. There were four items
in this objective; their p-values ranged from 38% to 84% with a mean of
60%. The profile analysis procedures determine the overall distance
between the predicted and actual performance profiles. They also determine
the amount of this difference which is due to (1) a tendency to either
underpredict or overpredict and (2) a failure to recognize the relative
difficulties of the four items.






17 Dr. Anthony J. Conger of RTI developed this procedure by adapting one
described in Wiggins, Jerry S. Personality and Prediction: Principles of
Personality Assessment. Reading, MA: Addison-Wesley, 1973.

      Figure 1: COMPARISON OF ACTUAL AND PREDICTED PERFORMANCE PROFILES

                        Percent of Students Correct
 Item     Actual    Teacher A    Teacher B    Teacher C    Teacher D

   1        38         53           40           50           23
   2        45         30           40           25           30
   3        73         88           75           40           58
   4        84         69           85           85           69

 [Four line graphs, one per teacher, plot the predicted profile against
 the actual profile for items 1 through 4 on a percent-correct scale.
 Actual mean = 60 in every panel; predicted mean = 60 for Teacher A,
 60 for Teacher B, 50 for Teacher C, and 45 for Teacher D.]



These examples show that teachers A and B had predicted means which
were identical to the actual mean, while teachers C and D had predicted
means which were 10 and 15 percentage points below the actual mean,
respectively. These results do not necessarily mean that teachers A
and B are good diagnosticians and teachers C and D are poor diagnosticians.
Table 2 shows that, when you adjust for the difference between means,
predicted profiles D and B are closest to the actual profile (adjusted
distances of 0 and 8.5, respectively). These results indicate that
teachers D and B were quite accurate in predicting the relative difficulties
of the four items. Teacher C showed the least ability to recognize
relative difficulties. For example, he predicted that item 1 would be
easier than item 3 when in fact many more students were correct on item
3 than on item 1 (73% versus 38% correct). Teacher ability to recognize
relative item difficulties is shown graphically in Figure 1 by the fact
that the profiles in example D are parallel, while the profiles in
examples A and C are clearly not parallel.
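The two-part breakdown of profile distance can be illustrated with a short Python sketch. The paper does not give the formulas used in the procedure, so the definitions below are an assumption: the overall distance is taken as the mean squared difference across items, the mean component as the squared difference between the two profile means, and the adjusted distance as what remains. With the item values as best read from Figure 1 (Teacher C's fourth prediction taken as 85, the value consistent with the reported means), these definitions reproduce the distances discussed above.

```python
def profile_distance(actual, predicted):
    """Split the gap between two profiles into two components.

    Returns (overall, mean_part, adjusted):
      overall   -- mean squared difference across items
      mean_part -- squared difference between the profile means
      adjusted  -- overall minus mean_part, i.e. the distance left
                   after shifting the prediction to the actual mean
    """
    n = len(actual)
    diffs = [p - a for a, p in zip(actual, predicted)]
    overall = sum(d * d for d in diffs) / n
    mean_part = (sum(diffs) / n) ** 2
    adjusted = overall - mean_part
    return overall, mean_part, adjusted


# Percent correct on the four fraction items, as read from Figure 1.
actual = [38, 45, 73, 84]
predicted = {
    "A": [53, 30, 88, 69],
    "B": [40, 40, 75, 85],
    "C": [50, 25, 40, 85],   # fourth value inferred, see lead-in
    "D": [23, 30, 58, 69],
}

results = {t: profile_distance(actual, p) for t, p in predicted.items()}
```

Teacher D is 15 points low on every item, so the whole overall distance is accounted for by the difference in means and nothing remains after adjustment. Teacher A has no difference in means, but each of the four predictions is 15 points off, so none of the distance is removed by adjustment.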



                                 Table 2

                   ANALYSIS OF THE DIFFERENCES BETWEEN
                ACTUAL AND PREDICTED PERFORMANCE PROFILES

                                  Actual Mean
           Predicted   Actual     Minus             Overall    Mean       Adjusted
Teacher    Mean        Mean       Predicted Mean    Distance   Distance   Distance

   A          60         60             0            225.0        0        225.0
   B          60         60             0              8.5        0          8.5
   C          50         60            10            408.5      100        308.5
   D          45         60            15            225.0      225          0



Another area currently being investigated is having teachers predict
which errors will be most common so these predictions can be compared to
the actual results. These types of results, as well as the results of
profile analyses, can be useful for planning teacher education programs.
Teachers who are unable to predict performance patterns may also need
help in diagnosing student learning needs. If certain objectives seem
to pose problems for many teachers, in-service and pre-service programs
can focus on techniques for diagnosing student needs and instructional
materials for meeting these needs.

Time and budget constraints have almost completely prevented research
into the varying performance standards which would be established by different
consensus groups working with identical assessment items and populations
of students. However, one small study of this problem was done in
conjunction with the 1976 Washington statewide assessment of fourth grade
mathematics.

The Washington Mathematics Committee was divided into two separate groups
of seven members each to establish desired and predicted performance levels
for mathematics items. The groups were about equally representative in terms
of university mathematics educators, teachers, and administrators. Each group
was to establish desired and predicted levels for approximately half of the
items, but eight items were assigned to both groups. The groups worked
independently after being given the same instructions. For the eight items
both sets of criterion standards were announced, and the full committee
established final consensus standards.

Table 3 shows the results of the study. While this study was far too
small and informal to be conclusive, the results are encouraging. In
establishing the desired level for the eight items the two groups had exactly
the same levels for two items, and they were as far as 15 percentage points
apart only once. In establishing the predicted level the two groups were never
in exact agreement, but they were within five percentage points twice, and
they were as far as 15 percentage points apart on only three items.





                                Table 3

          DESIRED AND PREDICTED CONSENSUS LEVELS ESTABLISHED BY
               GROUP A, GROUP B AND THE COMBINED COMMITTEE

                 Desired Level                    Predicted Level
Item     Group A   Group B   Combined     Group A   Group B   Combined

  1         50        65        60           40        55        50
  2         75        75        75           65        70        67
  3         75        85        80           60        75        65
  4         75        80        80           65        70        70
  5         80        90        85           75        85        80
  6         80        90        85           70        80        75
  7         80        90        85           75        85        80
  8         70        70        70           50        65        55



Student performance results are not yet available from the Washington
assessment, but it will be interesting to see if one group was consistently
better than the other in predicting item performance. It will also be
interesting to see if student performance is usually closer to the consensus
of the combined group than it is to that of either Group A or Group B.
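The agreement between the two groups summarized earlier can be recomputed from Table 3 with a short Python sketch. The variable and function names are illustrative; the levels are the Group A and Group B columns of the table.

```python
# Levels (percent) for the eight shared items, from Table 3.
desired_a = [50, 75, 75, 75, 80, 80, 80, 70]
desired_b = [65, 75, 85, 80, 90, 90, 90, 70]
predicted_a = [40, 65, 60, 65, 75, 70, 75, 50]
predicted_b = [55, 70, 75, 70, 85, 80, 85, 65]


def gaps(group_a, group_b):
    """Absolute difference between the two groups, item by item."""
    return [abs(a - b) for a, b in zip(group_a, group_b)]


desired_gaps = gaps(desired_a, desired_b)
predicted_gaps = gaps(predicted_a, predicted_b)

summary = {
    "desired: items in exact agreement": desired_gaps.count(0),
    "desired: items 15 points apart": desired_gaps.count(15),
    "predicted: items in exact agreement": predicted_gaps.count(0),
    "predicted: items within 5 points": sum(g <= 5 for g in predicted_gaps),
    "predicted: items 15 points apart": predicted_gaps.count(15),
}
```

The same item-by-item gaps could later be computed between each group's predictions and the actual student results, once those become available, to see which group predicted more accurately.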






REFERENCES 



Airasian, Peter W. and Madaus, George F. Criterion-Referenced Testing
in the Classroom. NCME Measurement in Education, May 1972, Vol. 3,
No. 4, 1-8.

Elliott, Muriel C. Teacher Outcomes Studies: The Development of Methods
for Obtaining Teacher Estimates of Minimal and Desired Student
Performance. Paper presented at the Southeastern Invitational Conference
on Measurement in Education, Knoxville, December 1974.

Jaeger, Richard M. Measurement Consequences of Selected Standard-Setting
Models. Paper presented at the meeting of the National Council on
Measurement in Education, San Francisco, April 1976.

Meskauskas, John A. Evaluation Models for Criterion-Referenced Testing:
Views Regarding Mastery and Standard-Setting. Review of Educational
Research, 1976, 46, 133-158.

Millman, Jason. Domain-Referenced Measures. Review of Educational Research,
1973, 43, 205-216.

Quirk, Thomas J. Some Measurement Issues in Competency-Based Teacher
Education. Phi Delta Kappan, 1974, LV, 316-319.

Smilansky, Jonathon and Guerin, Robert O. Minimal Acceptable Performance
Levels for Criterion-Referenced Multiple Choice Examinations and Their
Validation. Paper presented at the meeting of the American Educational
Research Association, San Francisco, April 1976.






N-ABELS -- A MANAGEABLE TECHNIQUE FOR MONITORING
THE ACQUISITION OF ESSENTIAL LEARNING SKILLS



Harriet A. Egertson and Hugh A. Harlan
Harriet A. Egertson, Presenter
Nebraska Department of Education



Ms. Harriet A. Egertson
Consultant, School Management Svs.
State Department of Education
233 South 10th Street
Lincoln, Nebraska 68508

Mr. Hugh Harlan
Administrator, School Management Svs.
State Department of Education
233 South 10th Street
Lincoln, Nebraska 68508






N-ABELS — A MANAGEABLE TECHNIQUE FOR MONITORING 
THE ACQUISITION OF ESSENTIAL LEARNING SKILLS 



HARRIET A. EGERTSON and HUGH A. HARLAN 
NEBRASKA DEPARTMENT OF EDUCATION 



Introduction 

In the spring of 1974, the Nebraska Department of Education began
to develop an assessment instrument which focused on skills that could
be considered essential for continuing success in school and for learning
independence. The product of this work was the publication in the
summer of 1975 of the Nebraska Assessment Battery of Essential Learning
Skills. N-ABELS can be described as a goal-oriented teaching instrument
with an evaluation component providing a basis for determining acquisition
of twelve defined skills. Each goal is stated in terms of an acceptable
performance standard which clearly describes what the student must do.

Over half the schools in Nebraska are using the battery on a voluntary
basis this year, and early information received is most promising. It is
hoped that the use of this battery will have the following effects:
1) to assure the public that their stated priorities are taken seriously
by the schools; 2) to help the public accept new programs by assuring
mastery of essential skills; 3) to answer requests for accountability
without imposing legal prescriptions and restraints; and 4) to clarify
the continuing responsibility of each teacher to work toward competency
in essential skills for each student.

The Development of N-ABELS 

Skill Selection

The twelve skills included in N-ABELS were chosen by committees of
professionals in three general areas: communication skills, mathematics
skills, and inquiry skills. A set of predetermined criteria for choosing
skills aided the selection process. The skills in N-ABELS are those:

1) For which the school assumes the primary instructional responsibility.

2) Which are necessary for independence in learning.

3) Which engender wide public agreement concerning their importance.

4) Which are commonly introduced in the elementary school.

5) Which can be assessed by readily demonstrable student performance.

6) Which can be assessed without prescribing teaching methodology.

Each skill is defined in terms of what the student must do to
demonstrate mastery, and the actual tests were constructed to conform to
these definitions. The test is at the same difficulty level on all
forms (Sample and Forms A-D are available). Because these skills are
considered important for a student to be successful in school, they
should be acquired by students as soon as possible. Of course, most
students will achieve far beyond this level. The primary purpose of the
test is to pinpoint individual students who still need help in learning
essential skills. Assessment begins in the upper elementary grades,
providing ample time for aiding students with problems in a particular
skill. Once assessment is initiated, a student continues working on the
skills until all of them have been mastered. This may take several
years. As soon as students demonstrate mastery in any one of the twelve
skill tests, they are finished with that portion and are not retested on
that skill.
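
The record keeping implied by this retest-until-mastery rule can be sketched in Python. This is a hypothetical illustration, not part of N-ABELS: the class, its method names, and the three placeholder skill names are assumptions, and a real record would list all twelve skills.

```python
from dataclasses import dataclass, field


@dataclass
class StudentRecord:
    """Tracks which skill tests a student has already mastered."""
    skills: tuple
    mastered: set = field(default_factory=set)

    def record_result(self, skill, passed):
        """Store one test result; a mastered skill stays mastered."""
        if skill not in self.skills:
            raise ValueError(f"unknown skill: {skill}")
        if passed:
            self.mastered.add(skill)

    def still_to_test(self):
        """Skills not yet mastered, in their original order."""
        return [s for s in self.skills if s not in self.mastered]

    def finished(self):
        """True once every skill has been mastered."""
        return not self.still_to_test()


# Placeholder skill names for illustration only.
record = StudentRecord(skills=("reading", "writing", "spelling"))
record.record_result("reading", passed=True)
record.record_result("writing", passed=False)
remaining = record.still_to_test()
```

Only the skills returned by `still_to_test` would be scheduled for retesting, which matches the rule that a student is not retested on a skill once mastery has been shown.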

Communication Skills

All of the tests in N-ABELS which assess communication skills are
constructed from a vocabulary list of 2000 words that was compiled
during the course of the development of N-ABELS. The decision to use a
standard vocabulary list of most frequently used words as a way of
defining communication skills made it possible to develop tests which
assess the actual spelling and reading knowledge of a finite set of
words.

Two recent computer studies showed some promise of utility for this
project. Both concentrate on materials written for children. The
American Heritage Word Frequency list completed in 1971 was prepared for
use as the basis for the new American Heritage School Dictionary
(Carroll, 1971). It is based on five-hundred-word samples drawn from
1,045 texts (a total of more than 5 million running words), covering a
variety of text and trade materials for grades three through nine. Every
unique symbol which appeared in the samples was included, so that the list
includes single letters, numerals, proper nouns, inflected forms, and formulae.
The words are listed in rank order according to frequency of occurrence.

Basic Elementary Reading Vocabularies (Harris, 1972) is also a
computerized list of selected materials for children; however, it differs
from the American Heritage list in several ways. The 4,500,000 running
words of this list are the entire vocabularies of six basal reader
series, and two series each in English, mathematics, science, and social
studies. Three types of lists are provided: 1) core lists contain words
which occur in a majority of the textual material at each grade level;
2) additional lists contain words which occur less frequently at each
level; and 3) technical lists contain more difficult words from the subject
areas of English, social science, science, and math. Inflected forms
were not included except for first and second grade. The 7,613 words
are arranged alphabetically by grade level so that frequency is impossible
to determine.



Since neither of these lists precisely fits the needs of N-ABELS, it
was decided to do a computerized comparison based on the American Heritage
list. The two lists were consolidated and the N-ABELS list was
determined using a set of predetermined criteria.

Three skill tests are provided in the communication area. The
reading skill test requires that the student demonstrate the ability to
translate printed symbols into speech by reading aloud a narrative
selection of approximately 100 words constructed from the Nebraska
Assessment Battery of Essential Learning Skills 2000 Word Reading
Vocabulary List.

In the area of writing two tests are provided. The writing skill
test is based on the first 1000 words of the N-ABELS Vocabulary and
requires the student to write legibly, spell correctly, and punctuate
appropriately from dictation a 100 word selection. Criteria for
determining legibility and correct punctuation are provided. In addition, a
supplementary spelling skills test requires the student to spell correctly
20 of the more difficult words randomly selected from the N-ABELS 1000
Word Writing Vocabulary.

Mathematics Skills

The forty-eight Mathematical Competencies and Skills Essential for
Enlightened Citizens developed by the National Council of Teachers of
Mathematics were used as a source for the math skills listed in N-ABELS
(Edwards, Nichols, and Sharpe, 1972). The NCTM competencies and skills,
however, include the full range of mathematics knowledge deemed necessary
for enlightened citizenship in a person's life beyond the school. The
skills included in this composite are those which fit the general purposes
of N-ABELS, that is, the skills which students need to know to be able
to progress independently in school.




The math skills tests require the student to demonstrate the
ability to read and write positive rational numbers correctly; to
associate positive rational numbers using decimal, percent, and the
fractional notation for halves, thirds, fourths, fifths, eighths, and
tenths with concrete or pictorial representations of objects; to write
the basic facts sums and products from a dictated tape; and to use the
four standard operations of arithmetic for whole numbers and decimal
fractions.

Inquiry Skills

The purpose of assessing the inquiry skills listed in N-ABELS is to
demonstrate the student's ability to operationally use the inquiry tools
cited. There is no effort to assess the ability to understand or
analyze any material discovered in reference sources. The assessment of
such higher level cognitive skills is beyond the scope intended in this
instrument.

The inquiry skills tests require the student to demonstrate the
ability to locate words in a dictionary; to locate topics in an encyclopedia
and extend that search to cross-reference items; to use the card catalog
to find materials on a given topic; and to use the current Official
Highway Map of Nebraska to locate places using cardinal directions,
identify physical and political features on a map, and estimate the
distance between locations using the map scale. As much as possible,
the tests for these skills involve demonstration of the ability to
efficiently locate information in the sources themselves.

Administering N-ABELS

Although N-ABELS is intended to be used primarily on an individual
basis, it is possible to administer some of the tests in large groups.
This is particularly helpful when the process is first initiated.
Retesting is generally done on an individual basis. The format is so
flexible that schools have been able to adapt the administration procedures
to their particular situations without compromising the intent of the
instrument.

Our field testing indicated that most of the skill tests can be
administered using materials available in the classroom with little
expense to the school district. It was also established that the tests
can be administered on a continuing basis with minimal disruption of the
regular school program.

Some of the test procedures are unfamiliar; therefore, many students
will not demonstrate mastery the first time a skill test is taken. The
errors made should be reviewed with the student, and practice activities
should be suggested which will help the student gain competence in this
skill. Each student should be encouraged to keep working toward mastery
of a specified skill which the teacher feels is most likely attainable.
If the student is near mastery, retesting should be scheduled within one
or two weeks. In addition to the school record, each student should
have a copy of the Student Progress Report Form so that a record of
successes can be shared with parents and friends.






References 



Carroll, J. B., Davies, P. & Richman, B., The American Heritage Word
Frequency Book. New York: Houghton Mifflin, 1971.

Dale, E. & Reichert, D., Bibliography of Vocabulary Studies. Columbus,
Ohio: Bureau of Educational Research, Ohio State University, 1957.

Edwards, E. L. Jr., Nichols, E. D. & Sharpe, G. H., Mathematical Competencies
and Skills Essential for Enlightened Citizens, The Arithmetic
Teacher, XIX (November, 1972), 601-607.

Harris, A. J., & Jacobson, M. D., Basic Elementary Reading Vocabularies.
London: Macmillan, 1972.

Horn, E. A Basic Writing Vocabulary, University of Iowa Monographs in
Education, Series 1, No. 4 (April, 1926), 3-226.






A PROCESS FOR DEVELOPING,
IMPLEMENTING AND FOLLOWING THROUGH ON AN
ASSESSMENT PROGRAM IN FIFTH- AND EIGHTH-GRADE MATHEMATICS



Max Morrison
Iowa Department of Education



Dr. Max Morrison
Director: PRE
Department of Public Instruction
Grimes State Office Building
Des Moines, Iowa 50125






A PROCESS FOR DEVELOPING, IMPLEMENTING AND FOLLOWING THROUGH
ON AN ASSESSMENT PROGRAM IN FIFTH- AND EIGHTH-GRADE MATHEMATICS



MAX MORRISON 



IOWA DEPARTMENT OF EDUCATION 



(Summary) 



Because of a growing concern over current measuring instruments
and in response to criticisms regarding student achievement in
mathematics, members of the Iowa Council of Teachers of Mathematics (ICTM)
and staff from the Iowa Department of Public Instruction initiated a
program for statewide mathematics assessment in June 1974. It was
designed to collect pertinent and specific data on student achievement
that could be used by teachers in diagnosing and prescribing instruction.
To assist the Department staff in developing and implementing the program,
a nine member committee consisting of classroom teachers, mathematics
supervisors and college mathematics instructors was established. The
committee's initial role was to establish criteria for an effective
assessment program and to monitor progress.

The goals of statewide mathematics assessment were:

1. To provide specific information on each student which could
   be used by the teacher to diagnose each student's strengths
   and weaknesses in mathematics;

2. To provide objective data for each teacher to furnish the
   basis for planning sequential learning activities for the
   entire class or for each individual in the class;

3. To provide data for school districts that could be used in
   revising the curriculum and in planning inservice activities
   for the staff;

4. To provide benchmark information to the Iowa Department of Public
   Instruction so that performance trends over time can be studied;
   and

5. To provide a process which could be replicated by local school
   districts in determining the effectiveness of each curricular
   offering.

After the committee reviewed procedures used by National Assessment
in identifying objectives and test items and the techniques used to
sample students and items, the committee developed the following criteria
for the Iowa Assessment program:

1. Participation by local schools in the assessment program would
   be on a voluntary basis;

2. A list of minimal performance objectives would be identified
   to insure that important objectives are not overlooked;

3. Student testing would be limited to the cognitive domain with
   only grades five and eight included;

4. Four items would be developed to measure the attainment of
   each objective;

5. The test would differ from a norm-referenced test in that it
   would not be designed for the purpose of comparing one student's
   performance with that of another;

6. The entire test battery would be administered to each student
   so that the data could be used for diagnostic purposes;

7. Test items would be developed which would incorporate the use
   of recall, application and analysis skills;

8. The test would be administered early in the school year to
   enable the teacher to utilize the data in planning instructional
   activities throughout the year;

9. Test scoring would be the responsibility of the classroom
   teacher with the results reported to the student as soon after
   testing as possible;

10. Each student's performance would be recorded on a single page
    profile sheet to be developed by the committee;

11. Assessment items would be limited to those which could be
    measured by paper and pencil;

12. The major focus of the assessment would be to provide the teacher
    with data for making decisions on individual students; and

13. Collection of the data at the state level would be for the
    purpose of identifying problems common to a number of schools
    and to provide baseline data which would be used to study
    performance trends over time.

The state assessment program did not attempt to measure proficiency
on all desirable mathematics skills and concepts. Many worthwhile
experiences such as constructing geometric figures using compass and
ruler or estimating the length, height, or weight of objects in the
classroom were not included.

A set of minimal objectives was identified for beginning fifth
and eighth grade students following an extensive survey of current
textbook content and after reviewing mathematics objectives identified
by other states and those identified in National Assessment. Iowa's
objectives were based upon skills and concepts deemed essential for
future success in mathematics, or skills required to deal with solving
practical problems in everyday life situations. The first list of
objectives developed was submitted to 150 mathematics teachers
throughout the state for comments regarding their appropriateness. Revisions
were made and items were developed to measure student performance on
each objective. These items were pilot tested in four school districts
and the test was then revised.




Local school districts were invited to participate in mathematics
assessment in March 1975. Requests for participation came from 140
school districts throughout the state, resulting in the distribution of
some 20,000 fifth grade tests and 22,000 eighth grade tests.

Mathematics teachers from the participating schools were asked
to review the objectives and test items prior to administering the
test. A part of the review included setting expected levels of
performance for each class. Each teacher was asked to estimate the per
cent of students he/she believed would demonstrate mastery on each
objective. This information would then be useful to the teacher in
analyzing the results, as he/she could compare the actual performance
against the expected level of performance.
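
The comparison of expected and actual performance described above can be sketched in a few lines of Python. This is an illustration only: the objective names, the percentages, and the 10-point tolerance are hypothetical and are not taken from the Iowa materials.

```python
# Hypothetical teacher estimates and observed results, in per cent of
# students demonstrating mastery on each objective (illustrative values only).
expected = {"whole-number addition": 90, "fractions": 70, "word problems": 65}
actual = {"whole-number addition": 93, "fractions": 58, "word problems": 52}


def compare_to_expectation(expected, actual, tolerance=10):
    """Return (objective, gap) pairs where actual mastery falls short of the
    teacher's estimate by more than `tolerance` percentage points,
    largest gap first. The tolerance is an assumed, adjustable cutoff."""
    gaps = {name: expected[name] - actual[name] for name in expected}
    flagged = [(name, gap) for name, gap in gaps.items() if gap > tolerance]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


shortfalls = compare_to_expectation(expected, actual)
for name, gap in shortfalls:
    print(f"{name}: {gap} points below the expected level")
```

A teacher could apply the same comparison to the class profile sheet, choosing whatever tolerance seems reasonable for the class.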

To assure comparability of data, local teachers were requested to
administer the assessment tests between September 15 and October 17,
1975. Teachers were to score the tests and record the results on
individual profile sheets. Individual profiles were to be recorded on a
class profile or school profile and forwarded to the Iowa Department
of Public Instruction, where a state profile was to be developed. As
participation was voluntary, schools that did not elect to send in class
or school profiles were not contacted to submit a report.

A unique feature of the Iowa Assessment Program which distinguishes
it from programs in other states is the assistance provided to the
teacher prior to and following the assessment. Area education agency
consultants and local school math coordinators arranged pre- and post-
test assessment activities with local teachers. For example, a consultant
would schedule a meeting of the staff to validate the objectives.
Teachers would compare local objectives against those included in the
assessment. Consultative assistance was also available to assist the
teacher in analyzing the results and determining appropriate
instructional activities.

The State Mathematics Committee developed the following aids:
an assessment handbook to explain the how and why of assessment; a guide
for diagnosing errors on the fifth grade test; cassette tapes of all
fifth and eighth grade test items for individual administration; and
a list of suggested instructional activities for the measurement strand
focusing on objectives included in the assessment. Other guides are to
be developed upon the request of the teachers.






Comments



Contrary to many of the recent criticisms that students lack basic math
skills, the results of the Iowa Assessment show that a large majority of
students have acquired a good foundation in mathematics. Evidence of success
can be noted in computation of whole numbers in both fifth and eighth grade.
The per cent of success declined somewhat on computation of fractions, decimals
and percentages, but students should have an opportunity to further develop
these skills during the remainder of the school year.

Performance of eighth grade students on word problems reveals that slightly
more than one-half of the students were able to apply previously learned skills
when confronted with a problem-solving situation. The lower rate of success
may be partially attributed to the type of word problems used in the assessment.
A number of problems required the student to analyze the information carefully
in order to discard the extraneous data prior to solving the problem. The
student's previous experience with this type of problem may have been extremely
limited.



Results indicate that when students are confronted with a word problem
where they are required to estimate or approximate a reasonable answer, more
than 50 per cent of the students are unable to select the "best estimate."
Math programs have not stressed this skill in the past, but with the increased
dependence upon pocket calculators and other automatic calculating equipment,
it becomes much more crucial to be able to judge the reasonableness of an
answer.

One could speculate further on the results of state assessment, but the
crucial judgments regarding the use of the data must be made by the local
school staff. Some questions that should be raised by the teachers as a result
of the assessment include:



1) Was the overall performance of students satisfactory? 

2) Which students did not perform satisfactorily? 

3) What were their skill deficiencies? 

4) How serious are these deficiencies? 

5) What are the consequences if nothing is done about correcting the
deficiencies?

6) What resources are available to assist with the problem?

7) How long will it take to resolve the problem?

8) What action can be taken to prevent similar problems from occurring?

9) What skill maintenance activities are necessary for all students?

10) What other objectives should be included in the assessment?

Seeking answers to the above questions, or to a similar set developed by
the teachers, should enable schools to pinpoint the problem and to allocate
resources to resolve the situation.




EDUCATIONAL QUALITY ASSESSMENT 
FOLLOW-UP SURVEY OF THE 1974 ASSESSMENT

Joyce S. Kim 
Pennsylvania Department of Education 



Dr. Joyce S. Kim
Educational Research Associate
State Department of Education
Box 911
Harrisburg, Pennsylvania 17126






EDUCATIONAL QUALITY ASSESSMENT 
FOLLOW-UP SURVEY OF THE 1974 ASSESSMENT



JOYCE S. KIM 
PENNSYLVANIA DEPARTMENT OF EDUCATION 



Introduction 



On November 9, 1973, the State Board of Education adopted Section
5.76, Educational Quality Assessment, as follows:

"During the school years 1973-74, 1974-75 and 1975-76, the
Department of Education shall use the Educational Quality
Assessment procedure to evaluate the effectiveness of the
educational program for all school districts in the state
based upon the Ten Goals of Quality Education adopted by
the State Board of Education. Public schools housing
approximately one-third of the students enrolled in each of
the three grades 5, 8 and 11 will be included in the
assessment each year."



Approximately one-third of the districts (170 districts) participated
in the 1973-74 assessment undertaken by the Division of Educational
Quality Assessment (EQA). The districts assessed during the 1973-74
school year contained:





              Number of     Number of
Grade         Schools       Students

5                785          51,342
8                240          53,326
11               191          48,276

TOTAL          1,216         152,944



All of the participating districts received their school reports from
the Division of Educational Quality Assessment in the fall of 1974.






OBJECTIVES OF THE STUDY

EQA is designed to offer reliable statistical information on
strengths and weaknesses, on which schools can base sound educational
planning decisions. Schools are free to use the results as they wish.
EQA's function is to provide schools with the starting point for a
self-analysis of their programs.

The EQA follow-up survey was carried out in order to ascertain what
effect the data and information disseminated by the Division of EQA have
had on the local school programs. Some of the questions answered were:
To what extent have the assessment results been disseminated? What value
do school districts see in EQA? How relevant are the EQA results to
educational and planning decisions?



PROCEDURES — METHODS AND TECHNIQUES 






The follow-up survey was conducted in all districts that had
participated in the March 1974 assessment. Careful consideration was
given in selecting one-third of the districts (170 districts) for the
assessment. The criteria for selection of a representative sample were:
size of the district, socioeconomic level determined by the financial aid
ratio, and geographic balance.
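
One common way to honor selection criteria like these is to group districts into strata and draw about one-third from each. The sketch below assumes that approach purely for illustration; the paper does not say how the 170 districts were actually chosen, and the district records, stratum labels and seed are hypothetical.

```python
import random
from collections import defaultdict


def select_one_third(districts, seed=0):
    """Draw roughly one-third of the districts from every stratum, where a
    stratum is a (size, aid ratio, region) combination. Assumed method,
    not the Division's documented procedure."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = defaultdict(list)
    for district in districts:
        key = (district["size"], district["aid"], district["region"])
        strata[key].append(district)
    chosen = []
    for members in strata.values():
        k = max(1, round(len(members) / 3))
        chosen.extend(rng.sample(members, k))
    return chosen


# Hypothetical sampling frame: 8 strata with 6 districts each.
districts = []
for size in ("small", "large"):
    for aid in ("low", "high"):
        for region in ("east", "west"):
            for i in range(6):
                districts.append({
                    "name": f"{size}-{aid}-{region}-{i}",
                    "size": size, "aid": aid, "region": region,
                })

sample = select_one_third(districts)
print(len(sample), "of", len(districts), "districts selected")
```

Drawing within strata keeps each combination of size, aid ratio and region represented in proportion, which a single unrestricted draw would not guarantee.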

In October 1975, the Division of EQA mailed a 20-item Follow-up
Opinionnaire to superintendents of 170 districts. Replies were received
from 138 districts by the end of December 1975. As shown in the Appendix,
the Opinionnaire included important questions for both schools and EQA to
ascertain reactions to the assessment. Questions in the Opinionnaire were
focused on 1) extent of dissemination of EQA results, 2) usefulness of
EQA data and 3) contribution of the EQA advisory service. Survey results
are valuable for EQA because the Division is in the process of
revising procedures, measurement instruments and condition variables for
post-1976 assessment.
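
The return figures above imply a response rate that is easy to check. The short sketch below does the arithmetic; the helper names are illustrative, and the second helper only approximates how many of the 138 responding districts lie behind any percentage reported in the results.

```python
def response_rate(returned, mailed):
    """Per cent of mailed questionnaires that were returned."""
    return 100.0 * returned / mailed


def per_cent_to_count(per_cent, n=138):
    """Approximate number of districts behind a reported percentage,
    assuming the percentage is based on n responding districts."""
    return round(per_cent * n / 100.0)


rate = response_rate(138, 170)
print(f"Response rate: {rate:.1f} per cent")
print("Districts behind a 50.0 per cent figure:", per_cent_to_count(50.0))
```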



RESULTS 

Through the follow-up survey, it was found that a wide dissemination
of the results had been made. The assessment results have been
disseminated to various categories of publics and organizations. The
approximate number of persons involved in each category was:



Persons Involved Number 

School board members 974 

Principals 766 

Central office staff 5U 

Most elementary teachers 9,327 

Most middle/junior high school teachers 5,375 

Most high school teachers 7,074 

Local service clubs (Lions, Jaycees, etc.) 596

PTA, PTO, any parent group 7,574

Students 37,891 

General public 390,536 

Other: Citizen Advisory Committee 3,5^3

Community Advisory Council 

Counselors 

In-service for staff 

Long Range Advisory Group 

Middle States Association 

Newspaper 

Psychological Interns 
Superintendent Lay Advisory Council 
Taxpayer Association 
University Consultants, etc. 

N = lU districts






This dissemination included many different approaches: written
reports, newsletters, in-service presentations, school board meetings,
etc. In addition, 82 per cent of the districts had prepared press
releases for their local newspapers. Therefore, it is evident that EQA
information is being shared with the public and is not buried in a school
administrator's desk drawer. Different methods were used to inform
others about the EQA report. They include:

QUESTION: WHAT METHODS HAVE BEEN USED TO INFORM OTHERS ABOUT 
THE EQA REPORT?



Methods Used Per Cent 

Special written report 57.2 

School district newsletter 51.4 

Press release 81.9 

Faculty Memorandum 34.1 

Curriculum Bulletin 13.8 

In-service presentations 75.4 

School board meeting 90.6 

Faculty meeting 89.9 

PTA presentation 45.7 

Special presentation 34.8 

Regular meeting 21.0 

Other 7.2 

None 0.7



N = 138

Over 50 per cent of the districts indicated that new programs are
being established as the result of EQA information. About 93 per cent of
the districts claimed that EQA information has been reviewed with building
administrators in program planning. Only 6.5 per cent of the districts
said that they had not, as yet, used the EQA data. Of these districts,
most stated that non-use of data was due to lack of time on the part of
district personnel. Only one district felt that the information was not
sufficiently credible to merit use. Therefore, one can conclude that use
of EQA data is definitely being made even though use of the data is not
mandated by the Department of Education.



QUESTION: WHICH OF THE FOLLOWING DESCRIBE THE USE MADE OF
THE EQA INFORMATION?



Use Made                                                     Per Cent

The information has not, as yet, been used                       6.5

The information was used to reflect the favorability
of our present programs                                         51.4

A new program is being planned for one or more of
our schools as a result of the information                      51.4

Revisions of some existing programs are underway
as a result of the information                                  60.1

The information has been reviewed with building
administrators for their use in program planning                92.8

The information served as a basis for teacher
in-service activity                                             58.7

A new program has been "tried out" in one of our
schools as a result of the information                           9.4

A new program has been incorporated into one
school's program as a result of the information                 17.4

A new program has been incorporated into several
of our schools as a result of the information                   18.1

N = 138



Most school districts (93.5%) were satisfied with the EQA interpreting
team's report, and more than two-thirds of the districts indicated
that they do not need a follow-up interpretation session.

As far as goal priorities are concerned, the particular goals most often
chosen as the five most useful in planning (out of ten) were: Basic
Skills-Verbal, Basic Skills-Math, Self-Esteem, Citizenship and Interest
in School and Learning. Each of the basic skills received the highest
vote (54% of the districts). This means that about 46 per cent of the
districts did not include basic skills in the top five. This is probably
due to the fact that they already have information in the basic skills
area which they can use in program planning.




QUESTION: MARK THE FIVE GOAL AREAS FOR WHICH THE INFORMATION
PROVIDED BY THE ASSESSMENT WAS MOST USEFUL IN
PLANNING.



Goal Areas                                Per Cent

Self-Esteem                                  51.4
Understanding Others                         34.8
Basic Skills-Verbal                          54.3
Basic Skills-Math                            54.3
Interest in School and Learning              44.2
Citizenship                                  40.6
Health                                       28.3
Vocational Attitude                          24.6
Vocational Knowledge                         29.0
Creative Activities                          29.7
Appreciating Human Accomplishments           23.2
Preparing for a Changing World               24.6

N = 138

More than one-half of the districts administered standardized
achievement tests (1973-74 school year) at grade levels two through eight. Only
two out of 138 districts said they had not given any standardized
achievement tests. However, this testing had been done in other school years.
Those two districts explained that they declared a moratorium on testing
during the 1973-74 school year.



QUESTION: AT WHICH GRADE LEVEL(S) DID YOU GIVE A STANDARDIZED
ACHIEVEMENT TEST IN THE 1973-74 SCHOOL YEAR?



Grade Level Per Cent 

1 48.6 

2 63.8 

3 73.2 

4 71.0 

5 74.6

6 83.3 



Grade Level Per Cent 

7 59.4 

8 58.7 

9 41.3 

10 26.8 

11 31.9 

12 15.9 

NONE 1.4 



N = 138 






Most districts felt that the EQA data have the most relevance for
change in teaching strategies. Many (more than 80 per cent) also felt
the data were relevant in changing course offerings and course content.



QUESTION: HOW RELEVANT IS THE INFORMATION PROVIDED IN THE REPORT
TO DECISIONS WHICH MUST BE MADE IN THE FOLLOWING AREAS?





                          Very      Quite                  Not       No
Changes Made In:        Relevant   Relevant   Relevant   Relevant   Answer

Course offerings           5.8%      26.8%      47.8%      15.2%      4.3%
Course content             8.7       35.5       43.5        9.4       2.9
Teaching strategies       12.3       37.0       36.2        7.2       7.2
Teaching assignments       1.4        3.6       29.0       58.0       8.0
Financial allocations      3.6       10.9       44.2       34.8       6.5
School facilities          1.4        9.4       23.9       59.4       5.8

N = 138



Some districts entertained suspicions of existing problems, but they
did not have data to support those suspicions. Over 83 per cent of the
districts reported that EQA had either provided them the data to confirm
such suspicions or called attention to problem areas which were
not previously detected by district staff.



THE EQA INFORMATION:

18.1% -- a) called attention to a problem area not previously
            noted by district staff.
52.2% -- b) confirmed suspicions about district problems.
13.8% -- c) did not identify any serious problems.
 2.9% -- No response.
10.9% -- Combination of a and b.
 2.2% -- Combination of a, b and c.



N = 138 




It is significant to observe that 81 per cent of the districts
considered the EQA program useful as a means of helping them make decisions,
and about two-thirds of the districts believed that the EQA information
represents a true picture of their district.



QUESTION: HOW DO YOU CONSIDER THE EQA PROGRAM AS A MEANS OF
HELPING YOU MAKE DECISIONS?

15.5% -- a) Very useful
65.6% -- b) Useful
15.6% -- c) Not very useful
 1.7% -- d) Useless
 1.4% -- e) No response



N = 138 



QUESTION: DO YOU BELIEVE THE EQA INFORMATION REPRESENTS A TRUE
PICTURE OF YOUR DISTRICT IN THE AREAS ASSESSED?

64.5% -- Yes
22.5% -- No
13.0% -- No response

N = 138



About 65 per cent of the districts said they used criterion-reference
subscale scores in their explanation of the results, and more than
three-fourths felt that item response frequency data would be valuable if given
with the initial report.

QUESTION: DID YOU USE CRITERION-REFERENCE SUBSCALE SCORES IN
EXPLANATION OF THE RESULTS?

64.5% -- Yes
31.2% -- No
 4.3% -- No response



N = 138 






QUESTION: WOULD ITEM RESPONSE FREQUENCY DATA BE VALUABLE IF GIVEN
WITH THE INITIAL REPORT?

75.4% -- Yes
17.4% -- No
 7.2% -- No response

N = 138



Condition variables, which were collected to identify the differences
in resources among schools, came primarily from students and teachers as
part of their assessment questionnaires. Seventy-one per cent of the
districts claimed that condition variable data is not harmful. Less than
four per cent said that teacher response variables and home education
and occupation variables are harmful. Another 71 per cent said that
the follow-up booklets on suggested strategies are valuable.



QUESTION: IS CONDITION VARIABLE DATA MORE HARMFUL THAN HELPFUL? 

10.9% -- Yes
71.0% -- No
18.1% -- No response



N = 138 



QUESTION: WERE THE FOLLOW-UP BOOKLETS ON SUGGESTED STRATEGIES OF
ANY VALUE?

71.0% -- Yes
 4.3% -- No
13.8% -- Not received
10.9% -- No response



N = 138 






Of the 20 items included in the Opinionnaire, 4 questions ask about
suggestions, comments and evaluations regarding EQA's program. These
questions are:

- What do you feel is the value of EQA?

- Can you suggest any programs, approaches or techniques used in
  your district that may account for any high scores you have?
  (Above the 75th percentile or above the prediction band)

- What type of assistance would you like from the EQA as a result
  of the assessment?

- Please add any comments you wish regarding the assessment program.

Value of EQA 

In general, most districts felt that EQA has provided a valuable tool
for accountability and evaluation to measure the Ten Goals of Quality
Education. Over two-thirds of the districts said that EQA provides
objective measurement of the general effectiveness of school operations and
allows an overall picture of the extent to which students are achieving
the goals. Twenty-two out of 138 districts (15.9%) mentioned EQA's value
in terms of measurement for affective domains of education. Districts
remarked on EQA's ability to measure affective goals which cannot be
measured easily, especially relative to state norms. Seven per cent of the
districts said that the instruments used to assess attitudes and values
need close study, evaluation and revision. Only two districts said that
there is no value.

Suggestion of Program 

Over 50 districts presented programs, approaches or techniques used
in their own districts which might account for any high scores they had.
These districts had their assessment data above the 75th percentile or
above the prediction band. Twenty-two districts felt that their high
scores resulted from dedicated teachers, in-service workshops, relevant
curriculum, and various forms of instruction such as individualized instruction,
non-graded programs and flexible scheduling. Fourteen districts said
that their high scores reflected their emphasis on basic skills and
vocational education. Ten districts indicated that a combination of
community, school characteristics and home background was the important
factor.

Requesting Type of EQA Assistance 

One hundred seven districts responded to questions about their
intention of getting assistance from the EQA as a result of assessment.
More than one-half of the districts asked for assistance in follow-up or
implementation strategies, and in developing or obtaining materials which
would be helpful for enhancing goal areas. Over 22 per cent answered
that assistance is adequate and said to "Just continue the excellent
work, push like hell to keep the staff and to expand the program."
Eighteen districts wanted further interpretation of EQA results, better
presentation of materials for public relations, in-service workshops or
continued advisory assistance.


Further Comments Regarding the Assessment Program 

Eighty-five out of 138 districts offered additional comments regarding
the assessment program. About one-third commended the good work of EQA,
saying, "Keep up the good work," "EQA has gained national recognition for
its worth," and so forth. Another third recommended improvement of the
instruments, expansion of resource materials, and utilization of EQA in
long-range planning. Three districts called for more
emphasis on cognitive aspects of education.






CONCLUSION 



In general, as shown by the follow-up survey results, there is a
strong indication that EQA results have been disseminated with an
important impact on school district programs. The assessment program succeeded
in providing school districts with information for a self-analysis of
their educational programs and planning. Although use of EQA data is not
mandated, a wide dissemination of the results has been made by a great
number of districts.

Some of the significant facts to support such successful results
are:

1) The assessment results have been disseminated to various
   persons, agencies, public relations media and other publics
   with many different approaches, as shown on pages 3 and 4.

2) Most districts felt that EQA data has the most relevance
   for change in teaching strategies, course offerings and
   course content (page 7).

3) Most districts considered the EQA program as a means of
   helping them make decisions (page 8).

4) A majority of the districts requested assistance in
   implementation of strategies.

5) A great number of the districts commended the good work of
   EQA with constructive recommendations for a future plan.

6) Most districts indicated that EQA has provided a valuable
   tool for evaluation to measure the Ten Goals of Quality
   Education.

On the basis of all the data, it is concluded that the March 1974
assessment undertaken by the Division of EQA was highly successful in
meeting its objectives and in benefits received by the participating
districts from this evaluation opportunity.

The EQA of Pennsylvania is worthy of continued support to achieve
its purpose for a development of a "whole, well-rounded individual" in
the Commonwealth.



EDUCATIONAL IMPORTANCE/IMPLICATIONS OF THE STUDY 



The quality of education has always been a matter of public scrutiny.
People are more concerned than ever about what children are learning and how
well they're learning it. The Department of Education shares that concern.

Hundreds of thousands of fifth, eighth, and eleventh grade students in
public schools have participated in the Department of Education's ongoing
Educational Quality Assessment Program in an effort to measure strengths and
weaknesses among schools throughout the state. It is important for both schools
and EQA to ascertain reactions to the assessment. Thus, the study has resulted
in significantly important evidence for 1) extent of dissemination of EQA results,
2) usefulness of EQA and 3) contribution of the EQA Advisory Service. Also, the
study results are valuable for EQA in the process of revising procedures,
measurement instruments and condition variables for post-1976 assessment.






APPENDIX 



Follow-Up Opinionnaire to Superintendent
for the March 1974 Assessment






DISTRICT ____________________

POSITION ____________________



Educational Quality Assessment
Follow-Up Opinionnaire to Superintendent
for the
March 1974 Assessment



 1. To whom have your assessment results been disseminated? (Indicate
    the approximate number of people involved for each category.)

    ___ a) school board members
    ___ b) principals
    ___ c) central office staff
    ___ d) most elementary teachers
    ___ e) most middle/junior H.S. teachers
    ___ f) most high school teachers
    ___ g) local service clubs (Lions, Jaycees, etc.)
    ___ h) PTA, PTO, any parent group
    ___ i) students
    ___ j) general public
    ___ k) other
    ___ none

 2. What methods have been used to inform others about the EQA report?
    (Mark as many as are appropriate.)

    ___ a) special written report
    ___ b) school district newsletter
    ___ c) press release
    ___ d) faculty memorandum
    ___ e) curriculum bulletin
    ___ f) inservice presentations
    ___ g) school board meeting
    ___ h) faculty meeting
    ___ i) PTA presentation
    ___ j) special presentation
    ___ k) regular meeting with ____________
    ___ l) other
    ___ m) none



 3. Which of the following describe the use made of the EQA information?
    (Check as many as appropriate.)

    ___ a) The information has not, as yet, been used.
    ___ b) The information was used to reflect the favorability of our
           present programs.
    ___ c) A new program is being planned for one or more of our
           schools as a result of the information. (List goal areas.)
    ___ d) Revisions of some existing programs are underway as a result
           of the information. (List goal areas.)
    ___ e) The information has been reviewed with building administrators
           for their use in program planning.
    ___ f) The information served as a basis for teacher inservice
           activity. (List goal areas.)
    ___ g) A new program has been "tried out" in one of our schools as
           a result of the information.
    ___ h) A new program has been incorporated into one school's program
           as a result of the information.
    ___ i) A new program has been incorporated into several of our
           schools as a result of the information.
    (If g, h or i has been checked, please list goal areas.)

 4. If in Item 3 you check (a), please check the statement below which
    best describes the situation in your district.

    ___ a) The information was not sufficiently credible to merit use.
    ___ b) District personnel have not had enough time to put the
           information to use.
    ___ c) The results did not contain any useful information.
    ___ d) Other (Please describe)

 5. Was the interpreting team thorough enough in its explanation of the
    report? ___ Yes ___ No ___ I was not present

 6. Should there have been a follow-up interpretation session?
    ___ Yes ___ No

 7. Mark the five goal areas for which the information provided by the
    assessment was most useful in planning.

    ___ Self-Esteem
    ___ Understanding Others
    ___ Basic Skills-Verbal
    ___ Basic Skills-Mathematics
    ___ Interest in School and Learning
    ___ Citizenship
    ___ Health
    ___ Vocational Attitude
    ___ Vocational Knowledge
    ___ Creative Activities
    ___ Appreciating Human Accomplishments
    ___ Preparing for a Changing World

 8. At which grade level(s) did you give a standardized achievement test
    in the 1973-74 school year? (Mark as many as are appropriate.)

    Grade Level:

    ___ 1)    ___ 2)    ___ 3)    ___ 4)    ___ 5)    ___ 6)
    ___ 7)    ___ 8)    ___ 9)    ___ 10)   ___ 11)   ___ 12)
    ___ None



 9. How relevant is the information provided in the report to decisions
    which must be made in the following areas?

                                          Very      Quite                 Not
                                        Relevant   Relevant   Relevant  Relevant

    a) changes in course offerings
    b) changes in course content
    c) changes in teaching strategies
    d) changes in teaching assignments
    e) changes in financial allocations
    f) changes in school facilities

10. The EQA information

    ___ a) called attention to a problem area not previously noted by
           district staff.
    ___ b) confirmed suspicions about district problems.
    ___ c) did not identify any serious problems.

11. How do you consider the EQA program as a means of helping you make
    decisions?

    ___ a) very useful
    ___ b) useful
    ___ c) not very useful
    ___ d) useless

12. Do you believe the EQA information represents a true picture of your
    district in the areas assessed? ___ Yes ___ No



13. What do you feel is the value of EQA?

14. Can you suggest any programs, approaches or techniques used in your
    district that may account for any high scores you have? (Above the
    75th percentile or above the prediction band)

15. Would item response frequency data be valuable if given with the
    initial report? ___ Yes ___ No

16. Did you use criterion reference subscale scores in your explanation
    of the results? ___ Yes ___ No

17. Is condition variable data more harmful than it is helpful?
    ___ Yes ___ No   If yes, which variables?

18. Were the follow-up booklets on suggested strategies of any value?
    ___ Yes ___ No ___ Not received   If no, which goal booklet(s)
    and why?

19. What type of assistance would you like from the PDE as a result of
    the assessment?

20. Please add any comments you wish regarding the assessment program.







HYPOTHESIS-TESTING IN LARGE-SCALE ASSESSMENT

Frank W. Rivas
National Assessment of Educational Progress



Mr. Frank W. Rivas
Associate Writer
National Assessment
of Educational Progress
1860 Lincoln Street, Suite
Denver, Colorado 80203






HYPOTHESIS-TESTING IN LARGE-SCALE ASSESSMENT 

FRANK W. RIVAS
NATIONAL ASSESSMENT OF EDUCATIONAL PROGRESS

(Summary)

Large-scale assessment reports provide descriptive data
which, even when interesting, cannot be used to improve the
educational system. Often, however, assessment results are
not even interesting because they merely confirm our
presuppositions. These are highly subjective observations, to
be sure, but observations with at least some evidence to
support them. But rather than defending these statements,
this paper attempts to analyze one cause for these problems
and then suggests a remedy.

One reason for the plodding character of assessment
reports is assessment design itself. Assessments have been
designed to survey skills in broad learning areas rather than
to test hypotheses about skill development. The failure to
test hypotheses is not inherent in assessment methodology;
hypothesis-testing can be incorporated in an assessment with
relatively few modifications. Since most research in education
is either theoretical or done with relatively small nonrandom
samples, assessments provide the ideal vehicle for generalizing
the results to a large population. Assessments can be used to
test hypotheses already tested in more limited situations,
thereby increasing external validity.

Hypothesis-testing is not entirely new to assessment, but
the process has thus far been confined to a few isolated
instances. In the national assessment of music, for example,
an attempt was made to understand problems children might have
with reading music notation. Music educators hypothesized
that line notation, which used height for pitch and length
for duration, would be easier to understand than the more
complex conventions of standard notation. To test this,
the song "Are You Sleeping?" was represented in both
notations as follows:




In fact, students had no more difficulty with the standard
notation than with the line notation, so the hypothesis was
not supported.
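
A comparison of this kind can be framed as a test of two proportions. The sketch below is illustrative only: the counts are hypothetical, not national assessment results, and the function name `two_proportion_z` is introduced here for the example. It also assumes simple random samples, whereas a real assessment with a complex sample design would need design-based standard errors.

```python
import math


def two_proportion_z(correct_a, n_a, correct_b, n_b):
    """Return (z, two-sided p) for H0: both groups have the same proportion correct."""
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts of students answering correctly under each notation.
z, p = two_proportion_z(correct_a=412, n_a=800, correct_b=405, n_b=800)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

A small z and a large p, as with these made-up counts, would correspond to the situation described above, in which the hypothesis that line notation is easier is not supported.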

The national assessment of writing provides an example of
using background variables for hypothesis-testing. To determine
which variable most affects a student's ability to write
well, the background form included questions about the number
of papers the student is required to write, the type and duration
of instruction on writing, whether the student typically
rewrites papers before turning them in and whether returned
papers include suggestions on how to improve writing. Although
the data have not yet been analyzed, the results will
be interesting since each of the variables represents what a
number of experts hypothesize to be an important determinant
of writing skills.




As these examples illustrate, hypothesis-testing can
yield results both more interesting and, for some audiences,
more useful than can broad surveys of skills. There are,
of course, measurement problems in hypothesis-testing (as
there are in surveys), but large-scale assessments generally
employ measurement specialists of more than adequate skill
for such designs. So the real problem is hypothesis-generation.

There are three techniques of hypothesis-generation
available: (a) systematically reviewing the literature and
interviewing experts in the field; (b) critically examining
previous assessment results; and (c) conducting qualitative
(observational) field research prior to the quantitative
assessment. Since the first two techniques are probably familiar
to assessment personnel, the remainder of this paper
will concentrate on using qualitative field research to generate
hypotheses for large-scale assessments.

Qualitative research, characterized by firsthand involvement
with the social world, allows one to generate hypotheses
at least somewhat independent of contemporary theoretical
models. Basically, there are two modes of observation: watching
what people do and asking them about their actions and
observations. Both direct observations and interviews range
in degree of structure from virtually unstructured to strict
interview schedules and full-scale observational systems.
It would be convenient if there existed a literature of
qualitative studies with hypotheses applicable to large-scale
assessments but, unfortunately, such a literature
does not exist. So assessments have the additional burden
of completing this observational research too.

There are, however, a limited number of qualitative
studies of education that have suggested hypotheses worthy
of large-scale investigation. For example, in a study of
Chicago elementary schools, Harriet Talmage and Robert M.
Rippey¹ were surprised to find that the predicted background
variables had little effect on educational achievement.
Based on their observations, they hypothesized that threat
of failure and the degree of socializing experience were
more reliable predictors of achievement. If hypotheses
like these could be supported in a large-scale assessment,
the implications for education would be great.

Threat of failure and degree of socializing experience
are, as noted above, more difficult to measure than the demographic
variables most commonly measured in national or
state assessments, but measurement is nonetheless possible.
Instruments or at least prototypes for such instruments have
been compiled in publications like Measurement of Affect and
Humanizing of Education.² A measure of threat of failure
could be as simple as asking elementary children whether
they like to display their schoolwork in the classroom.

¹ [...] Elementary School Cases" in Evaluating Educational Performance,
ed. Herbert J. Walberg (Berkeley, Calif.: McCutchan, 1974).

² Salt Lake City: Interstate Education Resource Service Center, 1974.

Qualitative research conducted prior to assessments can
provide hypotheses to be tested in the assessment proper.
Combining the advantages of qualitative and large-scale
quantitative studies might well provide a new direction for
educational research.






A PLAN FOR UTILIZATION OF ASSESSMENT DATA 
BY LOCAL EDUCATION AGENCIES 



John A. Jones and Charles D. Oviatt 
Missouri Department of Elementary and Secondary Education 



Dr. John A. Jones
Supervisor of Assessment 
Department of Elementary 
and Secondary Education 
Assessment Task Force 
PO Box 480 

Jefferson City, Missouri 65101 

Dr. Charles D. Oviatt
Director of Assessment 
State Department of Education 
PO Box 480 

Jefferson City, Missouri 65101 






A PLAN FOR UTILIZATION OF ASSESSMENT DATA
BY LOCAL EDUCATION AGENCIES 

JOHN A. JONES and CHARLES D. OVIATT

MISSOURI DEPARTMENT OF ELEMENTARY AND SECONDARY EDUCATION 



The State Education Agency (SEA) developed a plan for educational
assessment which included the following phases designed to improve the
quality of education in the state:

1. Goal Development - Statewide educational goals and subgoals were
written and approved by the State Board of Education.

2. Objective Development - Curriculum committees were appointed to 
develop educational objectives directly related to the statewide 
educational goals and subgoals. These objectives are broad 
enough to be used for curriculum planning on a K-12 basis and 
usually are not behavioral in nature. 

After the statewide educational objectives were written,
they were submitted to interested groups in the state for revision
and ranking for assessment purposes.

3. Identification of Assessment Purposes - It was decided to develop 
an assessment instrument that tested a wide variety of knowledge, 
skills, and attitudes, rather than to develop a series of narrowly 
defined criterion-referenced instruments. The decision was made 
not to create individual student scores and to use assessment information
only for general program planning purposes at both the SEA
and LEA levels. 

4. Population to be Assessed - It was decided to develop an assessment 
instrument for the grade 12 level and for the grade six level. 






5. Instrumentation - The rankings were used in selecting about
half of the educational objectives to be used as the basis for
the assessment instrument. A commercial test development firm
was contracted to develop performance indicators and related
test items for the selected educational objectives.

6. Administration - The assessment instrument is administered by
LEA personnel who have been trained by SEA staff. When statewide
assessment data are collected, groups of test items are randomly
assigned to students and administered in a single one-hour
testing session.

7. Scoring - All of the assessment test items are multiple-choice
items, and responses are made on machine-scorable answer
sheets.

8. Reporting and Utilization - A School Assessment Data Summary
was created for each school involved in the assessment effort for
use by LEA personnel. Schools may become a part of the SEA
assessment program by being selected as a part of a statewide
random sample of schools or by volunteering to conduct a local
assessment using the statewide assessment instrument.
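
The random assignment of item groups described under Administration can be sketched as follows. This is a minimal illustration, not the procedure actually used by the SEA: the roster, the number of item groups, and the function name `assign_item_groups` are all assumptions made for the example.

```python
import random


def assign_item_groups(student_ids, n_groups, seed=None):
    """Randomly assign each student one item group, keeping group sizes balanced."""
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled students round-robin so group sizes differ by at most one.
    return {student: index % n_groups for index, student in enumerate(shuffled)}


# Hypothetical roster of 30 students and 4 groups of test items.
students = [f"S{n:03d}" for n in range(1, 31)]
assignment = assign_item_groups(students, n_groups=4, seed=1976)
```

Because each student answers only one group of items, no individual student score is produced, which is consistent with the decision recorded under Identification of Assessment Purposes.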

The School Assessment Data Summary is composed of two parts, the
Summary of Data by Subgoal and the Summary of Data by Item. The Summary
of Data by Subgoal reports the frequency of correct and incorrect
responses and their corresponding percentages for each subgoal. The
percentage of correct responses (P-value) achieved by a statewide random
sample of students is also reported for each subgoal. The Summary of
Data by Item reports for each item a description of the item, school
frequencies of correct and incorrect responses, school item P-value,
and the statewide item P-value.
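
The two parts of the summary can be illustrated with a short sketch. The item records, subgoal codes, and function names below are hypothetical; only the definition of the P-value as the percentage of correct responses comes from the text. A statewide subgoal P-value would be obtained the same way from the statewide sample's frequencies, which this sketch does not include.

```python
from collections import defaultdict


def p_value(n_correct, n_incorrect):
    """Percentage of correct responses; NaN when no responses were recorded."""
    total = n_correct + n_incorrect
    return 100.0 * n_correct / total if total else float("nan")


# Hypothetical records: (item id, subgoal, school correct, school incorrect, statewide item P-value)
ITEMS = [
    ("item01", "1.1", 42, 18, 65.0),
    ("item02", "1.1", 30, 30, 58.0),
    ("item03", "2.3", 51, 9, 80.0),
]


def summary_by_item(items):
    """One row per item: school frequencies, school P-value, statewide P-value."""
    return [
        {"item": item, "subgoal": sub, "correct": c, "incorrect": w,
         "school_p": p_value(c, w), "state_p": state_p}
        for item, sub, c, w, state_p in items
    ]


def summary_by_subgoal(items):
    """Pool the school frequencies of all items belonging to each subgoal."""
    totals = defaultdict(lambda: [0, 0])
    for _item, sub, c, w, _state_p in items:
        totals[sub][0] += c
        totals[sub][1] += w
    return {sub: {"correct": c, "incorrect": w, "school_p": p_value(c, w)}
            for sub, (c, w) in totals.items()}


by_item = summary_by_item(ITEMS)
by_subgoal = summary_by_subgoal(ITEMS)
```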




A Guide for Interpretation and Utilization of Assessment Data is
sent to each school along with its School Assessment Data Summary. In
this guide, a model for curriculum review using assessment information
is presented, and a model for curriculum management based on educational
objectives is described. The attached flowchart shows the logic of
the model for curriculum review. The basic steps of this model are as
follows:

- The LEA decides to administer all or a part of the statewide 
assessment instrument. 

- After the instrument has been administered, the LEA staff and
other interested citizens identify which of the statewide
educational objectives should be the responsibility of their
school.

- Those educational objectives that are judged to be the responsibility
of the school are assigned to the programs and courses
of the school by the school staff. The school staff is divided
into committees to interpret the part of the school report for
which they are responsible.

- The LEA curriculum committees then rank the objectives for which 
they are responsible according to their curricular importance. 

- The LEA curriculum committees then take the part of the school
assessment data summary which gives information concerning the
objectives for which the committee is responsible. These
committees then rank these items according to the size of their
P-values and according to the size of the differences between
state and local item P-values. Items that consistently receive
high or low rankings are used to identify relative strengths
and weaknesses of students' performances. These committees then
write summaries of their findings and submit them to the school
administration.

- These committee reports are compiled and presented to the LEA
administration and school board for their review and action.

- After the processes of interpretation and administrative review
have occurred, the LEA administration is in a position to release
the assessment results to local news media.

- If a mandate from the LEA administration is given to the school
staff, specific program planning may then proceed in revising
the programs of the school to correct the identified weaknesses
in student performance.

- Specific program revision may identify needs for new instructional
equipment and materials which require budgetary expenditures. If
the LEA cannot alter its instructional expenditures, this should
be announced to the school staff near the beginning of this
planning process.

- Teachers should instruct students in reference to the domains of
knowledge described by the educational objectives that represent
identified weaknesses in student performance.

- After the LEA staff has been involved in creating plans to
correct identified programmatic weaknesses, long-term leadership
should be provided to help in increasing the levels of student
performance in the areas of identified curricular weakness.

- Periodic reassessment of student performance is needed to give
LEA personnel information to make evaluative comparisons concerning
the school's progress in correcting identified weaknesses.
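
The ranking step carried out by the curriculum committees (by size of the school P-value and by size of the difference between state and local P-values) might be sketched as below. The item data and the function name `rank_items` are hypothetical, and summing the two ranks is only one plausible way to find items that "consistently receive high or low rankings"; the guide itself may prescribe a different procedure.

```python
def rank_items(items):
    """Rank items two ways and order them by the sum of the two ranks.

    items: list of (item id, school P-value, statewide P-value).
    Rank 1 marks the weakest item on each criterion.
    """
    by_p = sorted(items, key=lambda r: r[1])            # lowest school P-value first
    by_gap = sorted(items, key=lambda r: r[1] - r[2])   # largest shortfall against the state first
    rank_p = {r[0]: i + 1 for i, r in enumerate(by_p)}
    rank_gap = {r[0]: i + 1 for i, r in enumerate(by_gap)}
    combined = sorted(items, key=lambda r: rank_p[r[0]] + rank_gap[r[0]])
    return [(r[0], rank_p[r[0]], rank_gap[r[0]]) for r in combined]


# Hypothetical items for one committee: (item id, school P-value, statewide P-value)
committee_items = [
    ("A", 45.0, 60.0),
    ("B", 80.0, 70.0),
    ("C", 55.0, 58.0),
    ("D", 62.0, 75.0),
]
ranking = rank_items(committee_items)
```

Items at the top of `ranking` are low on both criteria and would be candidates for identified weaknesses; items at the bottom are candidates for relative strengths.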

LOGIC MODEL FOR UTILIZATION OF ASSESSMENT INFORMATION

[Flowchart. Legible boxes, in order of appearance: START; 1 Assessment; 2 Sort Out Useful State Objectives; 3 Assign Objectives to Programs and Courses; 4 Order Useful Objectives According to Importance Within Subgoal; 5 Determine Desired P-value Level; 6 Subtract Actual from Desired Level of Performance; 17 Conduct Further Testing; Determine Curricular Strengths and Weaknesses; Instructional Objectives; Create Instructional Strategies; 13 Budgeting and Purchasing. Decision branches are labeled Yes and No.]
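
Boxes 5 and 6 of the logic model (determine a desired P-value level, then subtract the actual from the desired level of performance) amount to a simple discrepancy calculation. The sketch below uses hypothetical objectives and values; the name `performance_gaps` is introduced only for this illustration.

```python
def performance_gaps(desired, actual):
    """Return {objective: desired P-value minus actual P-value}, largest gap first."""
    gaps = {obj: desired[obj] - actual[obj] for obj in desired if obj in actual}
    return dict(sorted(gaps.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical desired and actual P-values for three objectives.
desired = {"obj1": 80.0, "obj2": 70.0, "obj3": 75.0}
actual = {"obj1": 62.0, "obj2": 71.0, "obj3": 60.0}
gaps = performance_gaps(desired, actual)
```

A positive gap means performance fell short of the level the committee wanted; a zero or negative gap means the desired level was met or exceeded.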






ACT TEST DATA AND PROGRAM ASSESSMENT 
FOR LARGE SCHOOL DISTRICTS 

Robert Cramer 
Shawnee Mission USD #512, Kansas



Mr. R. H. Cramer
Director, Program Evaluation 
U.S.D. #512 
7345 Lowell 

Shawnee Mission, Kansas 66204 






ACT TEST DATA AND PROGRAM ASSESSMENT 
FOR LARGE SCHOOL DISTRICTS 



ROBERT CRAMER 



SHAWNEE MISSION USD #512, KANSAS



"U.S. SCHOOLS A SCANDAL" and "TEST SCORES PLUMMET" are recent Kansas City
Star headlines to articles concerned with the decrease in academic ability of
the recent crop of high school seniors as measured by the highly publicized
Scholastic Aptitude Testing Program. The natural question for one to ask is,
"Well, how about our schools?" This report has been prepared to address such
a question by providing assessment information in proper perspective or with
a comparable benchmark.

For the past three years, the Department of Research and Evaluation has
been conducting studies in cooperation with the American College Testing
Program (ACT). ACT is the "Avis" of the testing industry, but the program
most commonly used by our students and those of comparable districts for college
placement testing. This project has provided us with a far better benchmark
of our graduating seniors' learning and achievement than anything we have
had in the past. This benchmark study has been made possible through the
selection of other schools and school districts (from Maryland to California)
similar to ours in many important ways, namely, those with students of the
same kinds as ours: high income, suburban districts.

The spin-off of this project is particularly relevant in today's world of
accountability and declining test scores. The aim of such studies is to bring
the leadership of the educational system a little closer to an operations
management philosophy by providing information for data-based decision making.

In the preparation of this report, state and national results are not
reported for comparison purposes because the test scores of both
this district and the "selected" group of schools are far enough above the
national and state samples to make them irrelevant benchmarks. It should also
be pointed out that the data for this analysis came from approximately two-thirds
of the graduating class each year.

This report has been prepared in four sections, each of which provides a
straightforward, factual attempt to answer the following questions:

Section 1. How well are our students learning?

Section 2. How do they feel about their schools?

Section 3. What kinds of help are they asking for?

Section 4. What are their career aspirations?






Section I 



Achievement 

Achievement data from both the Shawnee Mission graduates and those of the selected school
districts have been summarized as mean scores for analysis. These mean ACT scores have
been reported in the areas of English, Math, Social Science and Science, as well as a Composite.
In addition to a tabular comparison of the scores between Shawnee Mission and
selected schools, the scores of both groups have been displayed graphically for trend
analysis.



MEAN ACT SCORES
IN TABULAR FORM

            English       Math          Social Studies   Science       Composite
            S.M.   Sel.   S.M.   Sel.   S.M.   Sel.      S.M.   Sel.   S.M.   Sel.
1972-73     19.$   19.6   21.$   21.9   21.0   20.4      22.$   22.9   21.2   21.$
1973-74     19.e   19.4   21.0   21.7   20.8   2aa       22.8   22.8   21.2   2U1
1974-75     19.2   19.Z   19.9   20.?   20.0   19.$      22.?   22.9   20.$   20.$



[Graphs: Mean ACT Scores, plotted by year for Shawnee Mission and the selected schools in each subject area. Legible panel titles: Social Science, Composite.]



Interpretation:

English scores are consistent with national trends, showing a very slight
downward trend across three years for both groups. However, the Shawnee Mission
student performs as well as students of other top schools.

A steady decline in Math scores is apparent in both groups. Further, our
students show a lower level of achievement over the three years, as noted by the
parallel slopes of the two lines.

In Social Studies, both groups show a steady decline across years. However,
Shawnee Mission students show achievement above the selected group across all
three years.

Natural Science scores are contrary to the national picture in that they
remain nearly constant over the three years. There does not appear to be any
significant difference in Science achievement between Shawnee Mission students
and those of the selected schools.

The overall composite scores show the same general downward trend, which is
consistent with national trends. However, the graph shows that our students are
virtually the same in achievement as their suburban counterparts.
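
The interpretation above rests on two quantities for each subject: the slope of each group's scores over the three years and the average gap between the groups. A minimal sketch of both follows. The score series are hypothetical placeholders, not the district's figures, and the function names are introduced only for this illustration.

```python
def slope(years, scores):
    """Ordinary least-squares slope of scores regressed on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(scores) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, scores))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


def mean_gap(group_a, group_b):
    """Average of the year-by-year differences (group_a minus group_b)."""
    return sum(a - b for a, b in zip(group_a, group_b)) / len(group_a)


years = [1, 2, 3]                    # three consecutive testing years
district = [21.4, 21.0, 20.1]        # hypothetical mean scores for the district
benchmark = [21.9, 21.6, 20.8]       # hypothetical mean scores for the selected schools

district_slope = slope(years, district)
benchmark_slope = slope(years, benchmark)
average_gap = mean_gap(district, benchmark)
```

Similar negative slopes with a consistently negative gap would correspond to the "parallel slopes" pattern described for Math. With only three yearly means per group, such slopes are descriptive and do not by themselves support a significance claim.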



Summary:

In summary, the general downward trend in achievement as depicted nationally
is also apparent from these results, with the exception of Natural Science
and to a lesser degree English. When compared to the selected group, Shawnee
Mission appears to be:

(1) STRONGER in Social Studies,
(2) Same in Natural Science,
(3) Same in English,
(4) WEAKER in Math.

This information suggests that Shawnee Mission maintains at least the same
degree of excellence in Natural Science, Social Science, and English programs as
do other top schools across the country. In Mathematics, the performance of our
students exceeds both national and state averages, but when compared with comparable
school districts, there is an indication of need for improvements. This
fact is also borne out by results of the Iowa Tests of Basic Skills at lower
levels, where both Map Skills and Math Skills appear to be weaker among the skills
tested.

To provide an example of some of the skills tested by the ACT, sample test 
items for each area are attached. 






Section II 

Student Attitudes and Impressions Concerning Their School

For the past two years, ACT has provided valuable feedback information
regarding how students feel about their school. This information has been
summarized from data collected from the students tested, including their responses
to questions on the expressed adequacy of their high school education
according to their high school curriculum or program, and their degree of
satisfaction with various aspects of their high school.

The following two tables summarize the results of student responses as
to the adequacy of and their satisfaction with the high school program.

Summary:

Responses of both groups indicated that their high school education was
slightly less adequate in 1974-75 than in 1973-74. Shawnee Mission students,
however, considered their high school education to be no more or less adequate
than did their counterparts across the country.

Curiously, both groups seem to be more satisfied with various aspects of
their high school in 1974-75 than in 1973-74, even though they considered
their high school education to be slightly less adequate. By and large, over
the past two years, Shawnee Mission students have had similar feelings about their
school as have other suburban students across the United States.



Expressed Adequacy of HS Education
According to HS Curriculum or Program

[Table. Columns by curriculum: Bus-Comm, Voc-Occup, Coll Prep, General or Other. Rows by rating: Excellent, Good, Average, Below Average, Very Inadequate. Cell values are illegible.]



Student Satisfaction with Various
Aspects of the Local HS

[Table. Response columns: Satisfied, No Change Necessary; Neutral; Dissatisfied; No Experience. Rows: Classroom Instruction; No. & Variety of Course Offerings; Grading Practices & Policies; No. & Kinds of Tests Given; Guidance Services; School Rules, Regulations & Policies; Library or Learning Center; Laboratory Facilities; Provisions for Spec Help in Reading, Math, etc.; Provisions for Acad Outst Students; Adequacy of Prog in Educ & Planning. Cell values are illegible.]



Section III 
Requests for Educational Assistance 



One task of our schools is to help individuals cope with their own educational
problems, especially those involving aspects
which are prime responsibilities of schools and for which patrons hold them
accountable. The following chart shows the percent of students who requested
educational assistance with each of the selected program areas.

Summary:

The greatest number of students in all groups are asking for assistance
with their educational/vocational plans, i.e., career education. Next in line
as a priority need for assistance are Math Skills. These needs are also consistent
with the achievement results as well as the expression of student
attitudes and impressions of their school.






[Bar chart: percent of students requesting educational assistance, by program area, for the Shawnee Mission, Selected, and National groups. Legible program areas: Study Skills, Math Skills.]



Section IV 



Career Aspirations

This section summarizes the responses of students regarding their career
aspirations. Table A shows the percentage of students indicating a proposed
education major in selected career clusters. Table B shows the degree aspirations
of the groups. Table C shows the degree of confidence the students have
in their decisions regarding their planned education major and their first
career choice. Table D shows the average number of out-of-class activities
for the ACT tested students.



Summary:

In reviewing the results of the career aspirations of the students tested,
it was noted that there were close parallels between Shawnee Mission students'
proposed education majors and those of the selected schools. Both groups departed
from the national results only in the health professions and social
science.

In regard to educational degree aspirations, again there were close
parallels between the two groups. These indicators confirm the validity of the
benchmark as being useful for comparison purposes.

In summarizing the results of the confidence students placed in their
planned educational major and first vocational choice, men showed fairly consistent
results, indicating agreement in career aspirations and the educational
means to attain their career choice. For women, the 1974-75 groups seem to be
more certain about their plans for education and career choices than the
1973-74 groups. Overall, our students are very similar to the selected group
in educational plans and career aspirations. This tells us that our counseling
efforts are as effective as those of the benchmark group.

The summary of the average number of out-of-class activities indicates
what constitutes an important dimension of talent among the students but appears
to be relatively uncorrelated with achievement scores in English, Math, Social
Studies and Natural Science. Nonetheless, they represent a component of our
students' behavior and interest of extreme importance today. Inspection of the
data revealed an interesting profile when reflected against the benchmark
students. Our men students responded as taking part less in athletics and more
in leadership, music, science, work experiences and, curiously, writing. Our
women students take part significantly more in work experiences; in no case
do they participate less. A special note was made of the dramatic increase in
athletic participation by 1974-75 women students over 1973-74, more in line
with the selected group index. Compared nationally, both groups show a significant
difference. As expected, our students and the students from the selected
schools are much alike but depart from the national profile. Nationally, men
students in 1974-75 were much more involved in athletics, community service,
leadership, and speech, while slightly fewer were involved in work experiences.
Nationally, women students in 1974-75 were more involved in community service,
leadership, and music. Other categories were nearly the same.
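
The statement that out-of-class activities appear relatively uncorrelated with achievement scores can be checked with a Pearson correlation coefficient. The sketch below uses hypothetical student records, since the report does not give student-level data, and the function name `pearson_r` is introduced only for this illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical records: number of out-of-class activities and composite score per student.
activities = [1, 2, 3, 4]
composite = [22.0, 20.0, 20.0, 22.0]
r = pearson_r(activities, composite)
```

A coefficient near zero, as with these made-up values, would be consistent with the claim; a coefficient near +1 or -1 would contradict it.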

One conclusion that could be drawn is that students in the suburban schools
have to expend more time on their academic studies. They buy this time by participating
less in athletics if they are a man, and less in community service, leadership,
and music if they are a woman.




Table A
Proposed Education Major

                                      Shawnee Mission      "Selected"
Career Cluster                        73-74    74-75       73-74    74-75
Agriculture/Forestry                  4%       AW          AOf      9»
Architecture                          A, Q, O (column positions not recoverable)
Biological Science                    >        A           c        i
Business and Commerce                 lb       lo          16       16
Communications                        o        4           3        3
Computer & Info. Science              1        1           1        1
Education                             13       10          11       10
Engineering                           6        6           S        S
Fine Arts & Applied Arts              8        9           8        7
Foreign Languages                     1        1           1        1
Health Professions                    13       n           14       14
Home Economics                        3        z           2        2
(label illegible)                     2        2           J        I
Physical Science (Physics,
  Chemistry, Geology)                 2        1           2        2
Mathematics                           :        2           1        1
Community Service                     2        3           3        3
Social Sciences                       11       12          11       10
Trade, Indust., Technology            2        2           1        2
General Studies                       2, 3 (column positions not recoverable)
Undecided                             6        7           7        8






Table B
Educational Degree Aspirations

                                      Shawnee Mission      "Selected"
Level                                 73-74    74-75       73-74    74-75
*Vocational or Technical Program
  (less than two years)               1%       1%          1%       1%
Two-Year College Degree               9        6           8        7
Bachelor's Degree                     44       47          41       44
*One- or Two-Year Graduate Study
  (M.A., MBA, etc.)                   20       22          22       23
Professional Level Degree
  (Ph.D., M.D., LL.D., J.D., etc.)    21       21          22       21
Other                                 5        3           6        4

*The levels where these groups deviate from the National profile.






Table C
Students' Confidence in Their Planned Education Major
and First Career Choice

C-1 Men

                Shawnee Mission                            "Selected"
                Planned Ed. Major   First Career Choice    Planned Ed. Major   First Career Choice
                73-74    74-75      73-74    74-75         73-74    74-75      73-74    74-75
Very Sure       31%      Z$%        Z5%      Z7%           **       ZS%        **
Fairly Sure     4$       48                  4B            **       4$         **       46
Not Sure        ZZ       ZZ         Z7       37            **       Z4         **       38

**Not broken out by sex in 73-74; see total.

C-2 Women

                Shawnee Mission                            "Selected"
                Planned Ed. Major   First Career Choice    Planned Ed. Major   First Career Choice
                73-74    74-75      73-74    74-75         73-74    74-75      73-74    74-75
Very Sure       3S%      34%        Z9%      35%           **       32%        **       38%
Fairly Sure     4$       47         47       48            **       47         **       4$
Not Sure        29       Z2         Z7       2$            **       Z2         **       Z7

C-3 TOTAL (Men and Women)

                Shawnee Mission                            "Selected"
                Planned Ed. Major   First Career Choice    Planned Ed. Major   First Career Choice
                73-74    74-75      73-74    74-75         73-74    74-75      73-74    74-75
Very Sure       33%      32%        Z7%      Z8%           Z9%      Z8%        2S%      Z7%
Fairly Sure     4$       48         47       48            49       47         47       47
Not Sure        Zl       2Z         ze       Z$                     Z3         Z8       ZS






Table D
Average Number of Out-of-Class Activities

[Column headings are only partly legible: Total, Selected; years 73-74 and 74-75. Row labels are given where legible; values are listed in the order they appear.]

Art                 *?1   .73   l.OZ  .92   1.18  1.1^  1.01  .93
Athletics           3.11  2.99  1.83  2.10  9.01  2.03  Ti7   2.44  2.33  2.48
(label illegible)   .SB   .90   1.34  1.40  1.S2  1.41  1.10  1.1$  2.18  1.27
(label illegible)   1.23  1.10  1.47  1.37  1.46  1.22  1.48  1.29  1.3B  i.ie
(label illegible)   U2$   UZ2   1.09  1.13  1.38  1.64  1.33  1.3S  1.43  1.43  1.3$  1.39
(label illegible)   .43   .46   .32   .39   .31   .30   .44   .43   .$8   .3?
(label illegible)   .3$   .31   .49   .90   .€2   .73   .84   .7%   .60   .83   .37
Work Experience     2.33  2.33  2.is  1.98  2.01  1.84  1.8$  %.2t  2.23  2.09  2.13
(label illegible)   .73   .€€   .83   .96   1.0$  1.03  1.00  .87   .92   .86   .^3




TYPICAL SCIENCE TEST QUESTIONS

[Sample ACT science test items. The text of the items is illegible.]



106 



-101- 



[Figure: TYPICAL ENGLISH TEST QUESTIONS. Facsimile of a sample test page; the item text is not legible in the scan.]



[Figure: TYPICAL SOCIAL STUDIES QUESTIONS. Facsimile of a sample test page; the item text is not legible in the scan.]



[Figure: TYPICAL MATHEMATICS TEST QUESTIONS. Facsimile of a sample test page; the item text is not legible in the scan.]



AN EXAMPLE OF THE USE OF MULTIPLE MATRIX SAMPLING 
PROCEDURES IN A LOCAL DISTRICT 
ASSESSMENT PROGRAM 



Carl D. Novak
Lincoln, Nebraska Public Schools



Dr. Carl Novak
Senior Evaluator
Educational Service Unit #18
Lincoln Public Schools
720 South 22nd
Lincoln, Nebraska 68506




One of the critical limitations in the conduct of assessment in
elementary and secondary schools is the time, expense, and general
disruptiveness of the data collection efforts. The demand for more and
more relevant data has been an outgrowth of increased emphasis on program
evaluation and accountability. The cumulative effect has been a
significant increase in the amount of school time devoted to testing
and data collection efforts. In many districts the point has been
reached where teachers and administrators are no longer receptive to
additional data collection.

The problem is complicated by the fact that many of the data
collection efforts are unduly cumbersome. Administrators of assessment
programs, local evaluators, and directors of testing too often fail
to use sampling techniques effectively. For example, program evaluation
typically focuses on the performance of groups rather than the
performance of individuals. Yet evaluation data are typically collected
on individuals and the scores aggregated to estimate group characteristics.
Similarly, assessment is typically concerned with the performance of some
intact group, i.e., class, school, district, or region. Yet too often
assessment, particularly locally administered assessment, focuses on
individuals as the data collection unit. This paper discusses an
assessment effort in which data were collected on specific groups, i.e.,
grade levels within schools, without administering the complete
instrument to students. The procedure involved the use of multiple matrix
sampling.

Objectives of the Study

The objectives of the study were to (1) demonstrate the feasibility
of using multiple matrix sampling procedures to collect assessment data
efficiently and unobtrusively at both the district and building
levels; (2) determine whether or not data collected through the use of
matrix sampling are credible to principals and teachers; and (3)
test the feasibility of using such assessment data to help manage
curriculum in a highly decentralized district.

Instrument

The test used was the American Association for Health, Physical
Education, and Recreation (AAHPER) Cooperative Health Education Test,
Form 3A, published by the Educational Testing Service. The AAHPER
Cooperative Health Education Test was designed as an end-of-course
test to measure achievement in health education at the upper-elementary
and junior high school levels. Form 3A, which was developed for use
with junior high students, consists of 60 multiple choice items. The
items are distributed across eleven content areas: Consumer Health,
Community Health, International Health, Disease and Disorder, Personal
Health Care, Sex Education, Growth and Development, Nutrition, Mental
Health, Drug Use and Abuse, and Safety and First Aid.

Separate National Norms are provided by sex for grades 7, 8,
and 9. Normative information is also provided on each item by content
area.



Procedures

The 60-item, 40-minute test was randomly divided into six 10-item
matrix tests. The tests were distributed to all elementary and junior
high school buildings in the district, where they were administered to
students in conjunction with a normal [illegible]-minute class period. The
six tests were placed in random order prior to distribution to schools.
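
The division of the test into matrix tests can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not say how the random division or the random ordering was carried out, so the function names (`build_matrix_forms`, `assign_forms`), the fixed seeds, and the way forms are handed to students are hypothetical.

```python
import random

def build_matrix_forms(n_items=60, n_forms=6, seed=0):
    """Randomly partition item numbers 1..n_items into n_forms equal subtests."""
    if n_items % n_forms:
        raise ValueError("items must divide evenly among forms")
    items = list(range(1, n_items + 1))
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    rng.shuffle(items)
    size = n_items // n_forms
    # Each slice of the shuffled list becomes one matrix test.
    return [sorted(items[i * size:(i + 1) * size]) for i in range(n_forms)]

def assign_forms(student_ids, n_forms=6, seed=0):
    """Assumed distribution scheme: shuffle the students, then deal the
    forms out in rotation so each form is taken by about the same number."""
    rng = random.Random(seed)
    ids = list(student_ids)
    rng.shuffle(ids)
    return {sid: (pos % n_forms) + 1 for pos, sid in enumerate(ids)}

forms = build_matrix_forms()
assignment = assign_forms(range(1, 31))
print(forms[0])       # the ten item numbers on matrix test 1
print(assignment[1])  # the matrix test number given to student 1
```

Every item appears on exactly one matrix test, so each student answers ten items while the group as a whole covers all sixty.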

Separate machine-scorable answer sheets were printed to simplify
scoring and preparation for analysis. The answer sheets were distributed
after the tests had been distributed. Information was collected on
school, grade level, sex, matrix test number, and item responses.
Although the tests were not used to evaluate students, names were
requested to ensure that students took the test seriously. The entire
procedure took approximately ten minutes.

The matrix tests were used to estimate (1) the school mean,
standard deviation, and percentile distribution by grade level for each
of the 42 participating schools and (2) the district mean, standard
deviation, and percentile distribution by grade level. Since sex
differences on the AAHPER Cooperative Health tests are significant and since
separate norms are provided for boys and girls, the distributions for boys
and girls were estimated separately. The mean, standard deviation, and
percentile distribution were subsequently computed for 110 groups.¹
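
The paper does not give its estimation formulas. For the group mean, a standard matrix sampling estimator, when the matrix tests partition the full test, is the sum of the mean scores on the individual matrix tests. The sketch below assumes that estimator; the function name and the scores are hypothetical and are not data from the study.

```python
from statistics import mean

def estimate_group_mean(form_scores):
    """Estimate a group's mean score on the full test.

    form_scores maps matrix test number -> list of number-correct scores
    earned by the students in one group who took that matrix test.
    """
    if any(len(scores) == 0 for scores in form_scores.values()):
        raise ValueError("every matrix test needs at least one respondent")
    # Expected full-test score = sum of expected scores on each subtest.
    return sum(mean(scores) for scores in form_scores.values())

# Hypothetical number-correct scores (0-10) for one grade-by-sex group.
example = {
    1: [6, 7, 5],
    2: [8, 6],
    3: [5, 5, 6],
    4: [7, 9],
    5: [4, 6],
    6: [6, 8, 7],
}
print(estimate_group_mean(example))
```

No student takes more than ten items, yet the estimate is on the 60-item scale and can be compared directly with the published norms.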

Approximately 6,000 students (2,000 6th graders, 2,000 7th graders, and
2,000 9th graders) participated in the Spring 1975 assessment.



¹ 6th grade boys and 6th grade girls in 32 attendance centers (64 groups);
7th grade boys, 9th grade boys, 7th grade girls, and 9th grade girls in
10 attendance centers (40 groups); and district boys and district girls
for grades 6, 7 and 9 (6 groups).



In addition to testing the technical feasibility of using multiple
matrix sampling instead of census testing, the study focused on two
practical considerations, i.e., would the results be perceived as credible
by administrators and building staff, and would the results be useful?

Teachers and administrators within the district were accustomed
to data collection in which either the entire instrument was administered
to every student (the normal district testing program) or only
a sample of students was administered the entire instrument. Neither
of the procedures for collecting data is entirely acceptable. The
district has been making a conscientious effort to reduce the amount
of required student testing time to a minimum. For example, students
are no longer required to respond to an entire test battery, but instead
are required to take only subtests that teachers and administrators
indicate that they can and will use. The use of traditional or student
sampling procedures reduces the total amount of student test time but
is even more disruptive than census testing. The teacher with just a
few students absent for testing is faced with the dilemma of providing
instruction for the remaining students and making provisions for the
missing students to catch up.

The problems associated with the use of test data are not unique
to assessment programs. One of the reasons the district is minimizing
the present testing program is that there is little evidence that test
information has been used in the past. In order to facilitate use of the
information, special reporting procedures were developed. Traditionally,
test scores are reported by student in terms of grade equivalents,
percentiles, and stanines. School stanine distributions are also made
available to local building administrators. The reporting procedures
developed for the assessment data focused on group performance rather
than on the performance of individuals. School means and standard
deviations were reported by sex by grade level. District and national
means and standard deviations were also provided to add perspective
to the school scores. The percent of students scoring at or above the
25th, 50th, and 75th percentiles was provided using both district and
national norms. Initial plans called for stanine distributions to be
provided by sex by grade by school. The stanine distributions were
produced; however, nine data points proved to be too many to be
conveniently interpreted. The 25th, 50th, and 75th percentile system
represents a more workable compromise.
Results 

Estimates of the mean score, the standard deviation, and percentile
distributions were computed by sex by grade level by school. The
district mean for girls was higher than the district mean for boys at
all grade levels. The district mean for boys was higher than the
national mean for boys at all grade levels. The district mean for
girls was very similar to the national mean for girls at all grade
levels. The means, standard deviations, and percent scores for the
eleven content areas for both the district and national norm group are
presented in Appendix A. The strengths and weaknesses of students were
fairly consistent across both schools and grade levels.

The mean number of students in junior high groups was 95.3. The
mean number of students in elementary groups was 33.6. Ten of the
elementary estimates were based on a relatively small N (20 or less). The
estimates based on a small N were consistent with the estimates based
on more adequate N's.



The information collected was reported to principals, the
superintendent's cabinet, central office consultants, health education
teachers, and the Board of Education.

Different reporting formats were developed for each group. The
Board of Education, for example, received (1) a one page narrative,
(2) a table of means and standard deviations for both the district and
national norm group by grade by sex, and (3) a table of percent scores
for district and national norm groups by grade by sex for each of the
eleven content areas of the AAHPER Cooperative Health Education Test.
A copy of the Board report is attached as Appendix A.

Principals, on the other hand, received detailed information on
their schools. The school results were compared with local district
norms and national norms. A copy of the report form used with building
principals is attached as Appendix B.

The report to the superintendent's cabinet included:

1. School by school estimates of mean scores and
standard deviation by grade level and sex.

2. School by school estimates of the percent of
students scoring at or above the 75th, 50th,
and 25th percentile on both local and national
norms by grade level and sex.

Health teachers, like principals, received primarily school level
information; however, more emphasis was placed on item analysis
information and on analysis of performance by the eleven content areas.

The reports represented the first time quantitative information was
available on health education. The reports were well received by all
groups. The information served to stimulate communication between
groups interested in health education and was used to identify and
select inservice activities for health teachers throughout the year.
The results were also used to help plan the content of a one week
summer workshop for health education teachers.

As a result of the overall favorable reaction to the Spring 1975
Health Assessment, the assessment procedures were replicated in
Spring 1976. Similar assessment procedures have been incorporated
into the 1976-77 district testing program. The district health
consultant now strongly advocates the use of test data in curriculum
planning and is in the process of developing a physical fitness assessment.

Although the results of the 1976 Health Assessment have not been
fully analyzed yet, building level reports have been generated.

Problems Encountered

The three most serious problems encountered during the assessment
were:

1. Negotiation of contractual arrangements: A licensing agreement
had to be negotiated which allowed the district to print
the matrix tests. A set fee of $.06 was charged for each
matrix test printed. Actually, the problems encountered
negotiating with Educational Testing Service were very minor;
however, contractual problems have, in the past, made the
use of matrix sampling procedures very inconvenient.

2. Data coding problems: The use of matrix sampling procedures
made it necessary for students to record the matrix test
number (1-6) on their answer sheets. Since National Norms
on the Cooperative Health Test are provided separately for
boys and girls, sex also had to be coded on the answer sheet.
If either sex or matrix test number was missing, the student's
responses could not be used.

3. Small number of students: Since national norms were reported
by sex, separate analyses had to be conducted for girls and
boys. The number of students in some of the small elementary
schools was not sufficient to make separate estimates of
achievement for boys and girls.
None of the problems, however, is serious enough to preclude the
use of the assessment procedures. For example, 9[?]% of the answer sheets
collected during the 1976 assessment were filled out correctly. This
represents a slight increase over the 1975 assessment, when approximately
93% of the answer sheets were usable. The 1976 scores of sixth grade
girls at two schools could not be estimated because not enough students
handed in "usable" answer sheets.² Scores were estimated for 108 of
the 110 possible groups, so failure to obtain estimates on two groups
represents less than a two percent loss of information. Ten of the
sixth grade estimates were based on fewer than 20 students; however,
the estimates based on a small number of students presented no serious
problem.
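
The group count and the reported loss can be checked with a short calculation. The group breakdown is taken from the footnote in the Procedures section (64 + 40 + 6 groups); the variable names are illustrative only.

```python
# Groups for which separate estimates were attempted.
groups = {
    "grade 6, boys and girls, 32 attendance centers": 32 * 2,
    "grades 7 and 9, boys and girls, 10 attendance centers": 10 * 2 * 2,
    "district, grades 6, 7 and 9, boys and girls": 3 * 2,
}
total_groups = sum(groups.values())

estimated_groups = 108  # groups for which 1976 scores could be estimated
loss_pct = 100 * (total_groups - estimated_groups) / total_groups

print(total_groups)
print(round(loss_pct, 1))
```

Two missing groups out of 110 is a loss of about 1.8 percent, which agrees with the "less than a two percent" figure in the text.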

Related to the problem of small numbers of students was the problem
of negative variance. Negative estimates of the variance make it
impossible to estimate the distribution. This problem was encountered twice
in 1975 and twice in 1976 (or less than 2% of the runs).
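
Why a negative variance estimate blocks the percentile distribution can be illustrated with a sketch. The paper does not state which distribution was fitted; the code below assumes a beta-binomial (negative hypergeometric) distribution fitted by moments to the estimated mean and variance, which is one approach described in the matrix sampling literature. The function names and the example mean and variance are hypothetical.

```python
from math import exp, lgamma

def fit_beta_binomial(mean, variance, n_items=60):
    """Method-of-moments fit; returns (alpha, beta) or raises ValueError."""
    p = mean / n_items
    if not 0 < p < 1:
        raise ValueError("mean must lie strictly between 0 and n_items")
    binomial_var = n_items * p * (1 - p)
    # rho = 1 / (alpha + beta + 1): the spread beyond a plain binomial.
    rho = (variance / binomial_var - 1) / (n_items - 1)
    if not 0 < rho < 1:
        raise ValueError("variance estimate is negative or out of range; no fit")
    total = 1 / rho - 1  # alpha + beta
    return p * total, (1 - p) * total

def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def score_distribution(alpha, beta, n_items=60):
    """Probability of each full-test score 0..n_items under the fitted model."""
    probs = []
    for k in range(n_items + 1):
        log_choose = lgamma(n_items + 1) - lgamma(k + 1) - lgamma(n_items - k + 1)
        probs.append(exp(log_choose
                         + log_beta(k + alpha, n_items - k + beta)
                         - log_beta(alpha, beta)))
    return probs

def percent_at_or_above(probs, cut_score):
    """Percent of the group expected to score at or above cut_score."""
    return 100 * sum(probs[cut_score:])

# Hypothetical group estimates: mean 36.0, variance 100.0 on the 60-item scale.
alpha, beta = fit_beta_binomial(36.0, 100.0)
probs = score_distribution(alpha, beta)
print(round(percent_at_or_above(probs, 30), 1))
```

Under this assumed model a negative variance estimate, or even a positive one smaller than the binomial variance, leaves no valid parameters, so no distribution and no percent-above-percentile figures can be produced for that group.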



² An arbitrary cut-off of 15 students was established for the 1976
assessment. One of the student groups that participated in the 1975 assessment,
however, had only 9 students. Although evidence exists that supports the
use of matrix sampling procedures with relatively small N's (20 students
or more), the use of the procedure with very small groups is questionable.



Even the contractual problems encountered the first year were not
serious enough to deter potential users. The 1976 assessment reused
the matrix tests that were printed in 1975. Perhaps it would have been
better to resample the items and build new matrix tests. This step
would also necessitate printing new matrix tests and would increase the
cost of the assessment. In the future, consideration will have to be
given to reprinting the matrix tests each year. The eventual decision
will have to be tied to both practical and logical considerations.

It is wasteful to limit the study to a 60-item pool. Multiple matrix
sampling is an efficient technique for use with large numbers of
items. Increasing the number of items would also increase the content
validity of the assessment effort and provide more usable content
information.

Educational Importance/Implication 

The study demonstrated that multiple matrix sampling procedures
can be used to conveniently collect district assessment data that
otherwise could not or would not be collected. Building staff, who would
have been reluctant to use an entire class period for testing, were
willing to administer the tests at the beginning of an otherwise normal
class period.

The health assessment data were subsequently presented to five
different groups. In every case the information contained in the report
was well received. All group information available from the traditional
standardized testing program was provided through matrix sampling.
In addition, information not normally provided in the testing program
(school means, standard deviations, local normative information) was
also provided.

The information provided by this assessment was used by the district
health education consultant to plan curriculum changes and develop
in-service sessions for health education teachers. The information
served as a catalyst in discussing the health education curriculum and
was the focal point of health education reports to the superintendent's
cabinet, principals' council, consultants' council, and the Board of
Education.

The health education assessment has been incorporated into the
normal district testing program, and matrix sampling procedures are
being considered as alternatives to census testing for other district
assessments.




APPENDIX A




LINCOLN PUBLIC SCHOOLS 
INSTRUCTIONAL SERVICES 



Report to the Board of Education

December 16, 1975



HEALTH TEST RESULTS 



The AAHPER* Cooperative Health Test was administered last spring in all Lincoln
elementary and junior high schools. The test, which is published by Educational Testing
Service, was first administered in the Lincoln School District by staff members of the
Nebraska Center for Health Education, University of Nebraska-Lincoln, as part of a
statewide survey of health education at the 8th grade level. It was given in grades 6, 7 and 9
in Lincoln to collect additional information that would be useful in planning future health
programs at both elementary and junior high school levels.

In order to save both teacher and student time, a sampling procedure was used in which
each student answered only a few items. The whole process took about ten minutes of
classroom time.

Results are summarized on the attached sheets. Scores are reported for the total test and on
each of the eleven content areas. Key findings include:

1. As in national results, girls scored higher than boys at all grade levels. However, Lincoln
boys scored higher than national norms for boys while Lincoln girls scored about the same
as national norms for girls.

2. Lincoln boys scored higher than Lincoln girls on three of the eleven areas: community
health, personal health care, and safety & first aid. Nationally, girls scored higher than boys
on all eleven areas.

3. Lincoln boys scored highest in the areas of consumer health, nutrition, and community
health, while Lincoln girls scored highest in the areas of nutrition, consumer health, and
growth & development.

4. Lincoln boys scored lowest in the areas of personal health care, international health,
and disease & disorder, while the Lincoln girls scored lowest in the areas of personal health
care, international health, and mental health.

5. Areas judged to be of particular importance for which Lincoln students' scores were
judged to be less than satisfactory were mental health, personal health care, and disease &
disorder.

Dean Austin, Health Education Consultant 
Carl Novak, Evaluator 

Ron Brandt, Associate Superintendent for Instruction 



*American Association for Health, Physical Education, and Recreation



Results of the Spring Administration of the AAHPER Cooperative Health [Test]

            Lincoln    National    School      National
            Mean       Mean        St. Dev.    St. Dev.
GRADE 6
  Boys      33.3       NA¹         10.3        NA¹
  Girls     35.3       NA¹          9.2        NA¹
GRADE 7
  Boys      36.8       33.7         9.8        12.1
  Girls     36.7       38.7         [?]        11.1
GRADE 9
  Boys      43.8       42.0         9.3        11.1
  Girls     [?]        44.6         8.1         9.0

¹National Norms and Standard Deviations were not available for the 6th Grade.




[Illegible] in Percent for Each of the Eleven Content Areas in the AAHPER Cooperative Health Education Test for
the National Sample and the School District of Lincoln, NE, by Grade by Sex

Lin = Lincoln Mean; Nat = National Mean*; [?] = cell not legible in the scan; ? = digit not legible

                             GRADE 6               GRADE 7               GRADE 9
                          BOYS      GIRLS       BOYS      GIRLS       BOYS      GIRLS
                         Lin  Nat   Lin  Nat   Lin  Nat   Lin  Nat   Lin  Nat   Lin  Nat
Consumer Health          [?]  NA    70   NA    7?   64    76   72    ?2   81    8?   84
Community Health         57   NA    56   NA    66   66    65   6?    ?0   80    ?9   80
International Health     4?   NA    47   NA    51   43    60   56    66   63    67   63
Disease & Disorder       [?]  NA    50   NA    53   [?]   [?]  64    67   67    71   73
Personal Health Care     [?]  NA    ?7   NA    51   50    ?9   57    62   61    ?0   66
Sex Education            [?]  NA    56   NA    57   44    63   57    72   59    [?]  68
Growth & Development     58   NA    [?]  NA    67   57    7?   70    76   69    8?   79
Nutrition                6?   NA    70   NA    71   67    7?   7?    7?   7?    84   85
Mental Health            53   NA    58   NA    59   55    59   61    [?]  65    6?   69
Drug Use & Abuse         54   NA    56   NA    59   [?]   61   66    7?   75    76   79
Safety & First Aid       63   NA    61   NA    6?   50    65   55    7?   67    70   [?]

*National norms were not available for 6th grade students.



APPENDIX B



RESULTS OF THE SPRING 1976 ADMINISTRATION OF THE AAHPER COOPERATIVE
HEALTH EDUCATION TEST AT JUNIOR HIGH SCHOOL

Mean Raw Score and Standard Deviation on the 60 Item Test

            School   Lincoln   National    School     Lincoln    National
            Mean     Mean      Mean        St. Dev.   St. Dev.   St. Dev.
GRADE 7
  Boys
  Girls
GRADE 9
  Boys
  Girls

[Cell values are not legible in the scan.]

Percent of Total School Students Scoring at or above the
75th, 50th, and 25th Percentile Respectively when Compared to
National Norms and Local Norms, by Grade by Sex

                        School Results Compared With          Lincoln Results
Percent of Students     National Norms    Lincoln Norms       Compared With
at or above the                                               National Norms
                        Boys    Girls     Boys    Girls       Boys    Girls
GRADE 7
  75th percentile
  50th percentile
  25th percentile
GRADE 9
  75th percentile
  50th percentile
  25th percentile

[Cell values are not legible in the scan.]



Mean Percent Scores for Each of the Eleven Content Areas of
the AAHPER Cooperative Health Education Test for the
National Sample, the School District of Lincoln, NE and the [School]

[?] = cell not legible in the scan; ? = digit not legible. The School columns are blank on the report form.

GRADE 7
                          Number             BOYS                       GIRLS
                          of Items   School  Lincoln  National  School  Lincoln  National
                                     Mean    Mean     Mean      Mean    Mean     Mean
Consumer Health              5               74       64                77       7?
Community Health             5               6?       66                62       68
International Health         3               53       43                53       56
Disease & Disorder           5               55       52                67       64
Personal Health Care         7               53       50                51       57
Sex Education                6               56       44                66       46
Growth & Development         6               67       57                76       70
Nutrition                    7               73       67                76       75
Mental Health                4               60       55                63       61
Drug Use & Abuse             6               60       50                61       66
Safety & First Aid           4               64       50                63       55

GRADE 9
                          Number             BOYS                       GIRLS
                          of Items   School  Lincoln  National  School  Lincoln  National
                                     Mean    Mean     Mean      Mean    Mean     Mean
Consumer Health              5               ?0       81                83       84
Community Health             5               79       80                74       80
International Health         3               65       63                ?6       63
Disease & Disorder           5               66       67                72       73
Personal Health Care         7               61       61                62       66
Sex Education               [?]              70       60                76       68
Growth & Development        [?]              76       69                ?4       79
Nutrition                    7               7?       78                84       85
Mental Health                4               ?4       65                68       [?]
Drug Use & Abuse            [?]              7?       75                75       79
Safety & First Aid           4               75       67                70       64



MEASUREMENT PROBLEMS AND ISSUES RELATED
TO APPLIED PERFORMANCE TESTING

James R. Sanders
Western Michigan University



Dr. James R. Sanders
Associate Professor
Western Michigan University
Evaluation Center
College of Education
Kalamazoo, Michigan 49008



(Summary)



Applied Performance Tests have been defined in Sachse and
Sanders (1975) as "instruments designed to measure performance
in an actual or simulated setting." They are measurement devices
that require an actual or a close approximation of the setting
to which the performance is expected to be transferred.

It is the thesis of this paper that the technical criteria used
in the development and evaluation of applied performance tests are no
different from those used for any other behavioral measurement devices.
The unique aspects of applied performance measurement lie in the stress
given to the degree of realism of stimulus and response conditions.
Because criteria that are easily met with most psychological measures
are not met with applied performance tests, some unique measurement
problems arise. The purpose of this paper, after an initial excursion
into the history and theory of applied performance testing, is to define
relevant criteria used in evaluating applied performance tests and to
discuss problems of applied performance testing associated with these
criteria.

Sachse, T. P. and Sanders, J. R. A Look at Applied Performance Testing
in Education. Monograph of the Clearinghouse for Applied Performance
Testing. Portland, Oregon: Northwest Regional Educational Lab, 1975.

Two criteria used in the development and evaluation of any testing
device are its reliability and validity; they are related. The reliability
of a test places an upper limit on the criterion validity of that test.
This holds true for any measurement device. Another way of looking at
the relationship, however, is in terms of the setting in which the measure
is taken. Under tightly controlled conditions, the reliability of
a measure can be very high, but at the price of its validity,
assuming that the tightly controlled conditions do not reflect the
ultimate criterion setting. As the testing situation becomes
more real-life, the validity of the measure can increase, but usually
at the cost of the control or reliability of the measurement. This
trade-off is especially a problem when the real performance situation
is the ideal, because the reliability of measures taken under such
"noisy" conditions is often quite low. Steps to deal with this problem
are suggested.
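
The upper limit that reliability places on criterion validity can be made concrete. In classical test theory the correlation between a test and a criterion cannot exceed the square root of the product of their reliabilities. The summary does not give this formula, so the sketch below is an illustration of that standard result, and the function name is hypothetical.

```python
from math import sqrt

def max_criterion_validity(test_reliability, criterion_reliability=1.0):
    """Largest possible test-criterion correlation under classical test theory."""
    for r in (test_reliability, criterion_reliability):
        if not 0 <= r <= 1:
            raise ValueError("reliabilities must lie between 0 and 1")
    return sqrt(test_reliability * criterion_reliability)

# A measure taken under "noisy" real-life conditions with reliability 0.64
# cannot correlate more than 0.80 with even a perfectly reliable criterion.
print(round(max_criterion_validity(0.64), 2))
```

Lowering reliability therefore lowers the ceiling on validity, which is why the trade-off described above matters most when measurement is moved into the real performance setting.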

Standardization of testing conditions is another testing criterion 
that is a problem area in applied performance testing. In real-life 
situations, standardized testing can rarely be achieved. Examples
of standardized applied performance testing strategies are included
in the paper as models for those who would use such testing devices.

Sampling errors produce another measurement problem with applied
performance testing when the examiner wishes to generalize results to
other settings, times, or persons. Idiosyncratic factors that can affect
a person's performance in a real-life situation can come from many 
sources. A list of such factors is included in the paper. 

Scoring problems associated with applied performance testing include 
all of those that are typically associated with observation as a means 
of data collection. Training observers to be sensitive to what to
observe and how to record observations is critical. External distractions
and the "halo effect" are two scoring problems that must be dealt with
by those who use applied performance tests. Strategies for dealing with
these problems are discussed. 

A set of problems associated with the cost of applied performance 
testing must also be mentioned when this testing approach is being discussed.
One of the most serious criticisms of applied performance testing is that
it costs so much in personnel time, facilities, obtrusiveness in normal
operations, risk, and logistics--all of which serve to make its
usefulness questionable. Costs can be reduced, as substitutions for the
real-life situation are found, but measurement trade-offs, again,
become a problem (e.g., reductions in criterion validity). Ways of
dealing with the cost problem for assessment purposes are suggested
in the paper. 

Because applied performance tests are often developed and used 
for unique purposes and settings, technical data on such tests are 
often not available. The test user typically does not have an 
available manual that contains information about the adequacy of the 
measurement device. This problem compounds all other measurement 
problems and suggests that users should attend to, and plan to
deal with, the measurement problems outlined in this paper.






SYMPOSIUM ON: 
LARGE-SCALE ASSESSMENT REPORTING AND USAGE:
DELAWARE AND GEORGIA AS EXEMPLARS

Robert Bigelow and Hervey Scudder 
Delaware Department of Public Instruction 

AND 

Educational Testing Service 



Mr. Robert Bigelow
Supervisor, Education Planning
State Department of Public Instruction
Townsend Building
Dover, Delaware 19901

Mr. Hervey C. Scudder 
Program Director 
Educational Testing Service 
Princeton, New Jersey 08540 




THE DELAWARE EDUCATIONAL ASSESSMENT PROGRAM 
FROM THE PEOPLE FOR THE PEOPLE 



ROBERT A. BIGELOW
DELAWARE DEPARTMENT OF PUBLIC INSTRUCTION



Introduction

Over the past five years, the Delaware Educational Assessment Program
(DEAP) has become a recognized part of our state's educational scene. DEAP
has survived and grown in the midst of initial paranoia and frequent
controversy. Unlike many other state assessment efforts, DEAP is not
legislatively mandated. Even so, Delaware citizens and educators have become
very much aware of DEAP results and their implications. Each year when the
assessment results are released, debates about the purposes and value of
large-scale assessment are rerun in faculty lounges, local board meetings,
college classes, and newspapers. And each year our State Department of Public
Instruction beseeches the state legislature and searches its federal
pocketbooks for sufficient funds to support "another year" of statewide
assessment.

Of course, these problems are not unique to our state. But how were
we able to survive our growing pains and still provide relatively consistent
assessment services to Delawareans? In the next few minutes, I would like to
share with you two simple principles that do work in running our state
assessment program: (1) assessment for improvement; and (2) acceptance
through involvement.






Assessment for Improvement 

In the late 1960's, interest in legislating accountability through
mandated testing programs was sweeping the country. At this time, most
Delaware educators were expressing their concern that accountability through
test scores alone would have serious implications (which many of us have
subsequently witnessed). For these reasons, the DEAP was implemented as part
of a long-range plan for the improvement of state and local educational
programs. Activities implemented through this plan are directed toward
answering four major questions:

* What do we want from our educational system?

* What have our students attained?

* What are our program strengths and weaknesses?

* What can be done to improve our educational programs?

As indicated by the above questions, the formulation of statewide goals
and objectives was prerequisite to the implementation of needs assessment
activities. By 1971, statewide learner goals for the 70's and 80's were
established. So far, statewide educational objectives related to these broad
goal areas have been developed for the content areas of reading,
English/language arts, mathematics, science, social studies, and mental and
physical health. The objectives have been disseminated to educators
throughout the state in order to facilitate curriculum efforts for
kindergarten through grade eight students. Terminal objectives for secondary
students are currently being developed based upon recent interests in
survival skills and minimal requirements for high school graduation.

In 1971, the DEAP was initiated by the Planning, Research, and
Evaluation Division of the department to provide state and local educators
with information to answer the second and third major questions pertaining to
student attainments and program needs. Over the past five years, norm
referenced survey batteries have been administered to all regular first,
fourth, and eighth grade students throughout the state. In addition to
student ability and achievement data, information about community and school
resource factors has also been collected.

Each year DEAP results have been generated and distributed in various
ways to students, teachers, educational administrators, legislators, and the
community at large. Each year local officials have maintained the
responsibility for disseminating their district and school assessment results
in a manner they felt appropriate. However, from the very beginning, the
expected misuses of the data occurred: unfair achievement comparisons were
made between schools and districts, high scoring districts were cited as "the
place to live," and teachers began to be concerned about the effect of test
scores upon contract negotiations. In efforts to dissipate these concerns,
the department has focused all assessment reports, field services, and
publicity on the primary purpose for the DEAP -- to generate information
about local and statewide educational needs as a basis for future program
improvement.

Over the years, obvious misuses of the assessment results seem to be
diminishing. To the best of my knowledge, there has been no mass movement
to high scoring districts and no teacher has been fired because of assessment
results. On the contrary, most teachers participating in the DEAP have had
multiple opportunities to understand DEAP results for their school and
district. Local school board members and chief school officers annually
review their district achievement results in light of community and
educational resources. Newspaper articles have begun to report district
programs being implemented to alleviate curriculum weaknesses indicated by
the survey tests. Finally, an increasing number of project applications
utilizing DEAP results in their needs assessment sections are being generated
each year. These proposals represent millions of dollars that are being
allocated yearly to local districts to support programs meeting educational
needs indicated by the DEAP results.






Acceptance Through Involvement 

Another reason that DEAP enjoys increasing acceptance is that each phase
of the program is largely determined through concrete inputs from those it
serves. Examples of the important kinds of inputs that DEAP participants
contribute to each of the major assessment activities are discussed below.

Refinement of learner objectives. Every year representatives from local
school districts, higher education, and the department have been asked to
participate in the continued refinement of statewide educational objectives.
These people are organized into subject area task forces whose mission is to
annually revise and/or rewrite the existing statewide educational objectives.
This process can take many weeks since preliminary drafts of these objectives
must first be reviewed by local educators before final consensus is reached
through a modified Delphi technique. Even though the statewide objectives are
still incomplete, the yearly outcomes are of primary importance -- updated
sets of objectives developed and approved by Delaware educators.

Development and/or revision of assessment instruments. The DEAP
batteries used to collect information about student ability and achievement
are also annually reviewed. Instrument development and/or revision is based
upon inputs from three sources: (1) item specifications recommended by the
subject area task forces; (2) teacher comments about test content collected
during test administrations and inservice sessions; and (3) professional
contributions from our contractor, Educational Testing Service. In one case,
Delaware fourth and eighth grade social studies tests were constructed,
piloted, and refined through the joint efforts of social studies task force
members, reading specialists, department staff, and ETS. This year, selected
members from each subject area task force will be trained to generate their
own item alternatives to specifications recommended during a summer workshop.
ETS staff will then review these items and recommend final selections for
inclusion in the revised tests. In this way, DEAP participants should feel
even stronger ownership of the revised assessment instruments.

Administrative logistics. Managing the flow of assessment materials
becomes more efficient with each year of the program's operation. There is
not enough time here to adequately describe the kinds of cooperation among
our district test coordinators, which has enabled us to deliver, administer,
assemble, ship, and score assessment materials for every school district
within 20 working days. ETS has also provided valuable assistance in this
effort as will be described during the next presentation. Rising operating
costs have recently forced the department to take over many of the logistical
activities previously conducted by ETS. However, we have still been able to
meet our [...] successful management patterns developed jointly with our
contractor and the continued support of our local test coordinators.

Reporting assessment results. DEAP results are distributed to and
utilized by many audiences. Our primary targets have always been classroom
teachers, curriculum managers, and principals of each participating school.
Through mini projects implemented in each school district, these people learn
to interpret the assessment results in light of their own curriculum outcomes
and actually use the data to identify further areas of investigation and
improvement. In turn, demands for more specific information about curriculum
outcomes and corrective action have generated local requests for in depth
assistance from the department's dissemination unit and field service staff.
At another level, results from a longitudinal investigation of school
resources and achievement have prompted a request by the State Board of
Education for an in depth follow-up study of schools performing above or
below expectations. Finally, analyses of achievement results have indicated
increasing differences between Delaware and national norms as students
progress through the elementary and middle grades. This downward trend has
been of concern to many educators and citizens throughout the state. These
concerns are being voiced in professional and public demands to expand DEAP
services to provide more information about student attainment of each
statewide objective.

The Future of DEAP

We are initiating a new phase of our assessment program. Every two
or three years the traditional norm referenced batteries will be cycled in at
each participating grade level for benchmark studies of long-term changes. In
addition, we plan to annually administer objective referenced measures in
selected content areas beginning next fall. We anticipate that these new
measures will supply more complete information about student needs in each
subject area assessed. This additional information should also reinforce
further diagnostic efforts at the local level to help practitioners better
address the final and most important question of our long-range plan (What
can be done to improve educational programs?).

We are later than other states in entering the arena of objective
referenced measurement. However, we are finally ready; our people understand
it, and our people want it.






SELECTED BIBLIOGRAPHY 



Additional papers, handouts or brochures which
were not submitted or presented as part of the formal
paper sessions were made available to participants
during the three-day program. Because these materials
cover topics of general interest to assessment
personnel, they are being referenced here so that interested
readers will be aware of them.

Free single copies of the materials titled below 
can be obtained by writing directly to the author/ 
presenter.



LINCOLN PUBLIC SCHOOLS' POSITION PAPER ON ASSESSMENT

Dr. Ron Brandt
Associate Superintendent for Instruction
Lincoln Public Schools
PO Box 82889
Lincoln, Nebraska 68501



THE USE OF CORRELATES OF ACHIEVEMENT: 
CHANGEABLE AND UNCHANGEABLE 

Dr. Paul Campbell
Director, ESS
Educational Testing Service
Princeton, New Jersey 08540



SUCCESSFUL UTILIZATION OF STATE ASSESSMENT RESULTS BY LEAs 

Dr. Robert Coldiron 
Chief, Division EQA 
State Department of Education 
Box 911
Harrisburg, Pennsylvania 17126






TEXAS CAREER MEASUREMENT SERIES: 
FROM ASSESSMENT TO INSTRUCTION 

Mr. Keith Cruse
Program Director, Assessment 
Texas Education Agency 
201 East 11th Street 
Austin, Texas 78701 



1975-76 MICHIGAN ASSESSMENT PROGRAM: PARENT REPORT 
AND CITIZENS GUIDE PROJECT REPORT 

Mrs. Judith E. Moyer
Education Research Consultant
State Department of Education
PO Box 420
Lansing, Michigan 48902



DISTRICT USE OF ASSESSMENT RESULTS 

Dr. Alan Robertson, Chief
Division of Research
Bureau of Occupational Educational Research
State Education Department
Albany, New York 12234



A METHOD FOR EVALUATING ASSESSMENT 

REPORTING THE RESULTS OF STATEWIDE ASSESSMENT 

SETTING STANDARDS AND LIVING WITH THEM 

Dr. Lorrie Shepard
Laboratory of Educational Research
University of Colorado
Boulder, Colorado 80302



OAKLAND MICHIGAN TESTING HANDBOOK 
Dr. Richard Watson
Oakland Intermediate School District
2100 Pontiac Lake Road
Pontiac, Michigan 48084






